This Monograph was written for the Conference on the New Instructional
Materials in Physics, held at the University of Washington in the
summer of 1965. The general purpose of the conference was to create
effective ways of presenting physics to college students who are not
preparing to become professional physicists. Such an audience might
include prospective secondary school physics teachers, prospective
practitioners of other sciences, and those who wish to learn physics as
one component of a liberal education.

At the Conference some 40 physicists and 12 filmmakers and designers
worked for periods ranging from four to nine weeks. The central task,
certainly the one in which most physicists participated, was the
writing of Monographs.

Although there was no consensus on a single approach, many writers
felt that their presentations ought to put more than the customary
emphasis on physical insight and synthesis. Moreover, the treatment was
to be "multi-level"; that is, each Monograph would consist of several
sections arranged in increasing order of sophistication. Such papers,
it was hoped, could be readily introduced into existing courses or
provide the basis for new kinds of courses.

Monographs were written in four content areas: Forces and Fields,
Quantum Mechanics, Thermal and Statistical Physics, and the Structure
and Properties of Matter. Topic selections and general outlines were
only loosely coordinated within each area in order to leave authors
free to invent new approaches. In point of fact, however, a number of
monographs do relate to others in complementary ways, a result of their
authors' close, informal interaction.

Because of stringent time limitations, few of the Monographs have
been completed, and none has been extensively rewritten. Indeed, most
writers feel that they are barely more than clean first drafts. Yet,
because of the highly experimental nature of the undertaking, it is
essential that these manuscripts be made available for careful review
by other physicists and for trial use with students. Much effort,
therefore, has gone into publishing them in a readable format intended
to facilitate serious consideration.

So many people have contributed to the project that complete
acknowledgement is not possible. The National Science Foundation
supported the Conference. The staff of the Commission on College
Physics, led by E. Leonard Jossem, and that of the University of
Washington physics department, led by Ronald Geballe and Ernest M.
Henley, carried the heavy burden of organization. Walter C. Michels,
Lyman G. Parratt, and George M. Volkoff read and criticized manuscripts
at a critical stage in the writing. Judith Bregman, Edward Gerjuoy,
Ernest M. Henley, and Lawrence Wilets read manuscripts editorially.
Martha and Margery Lang did the technical editing; Ann Widditsch
supervised the initial typing and assembled the final drafts. James
Grunbaum designed the format and, assisted in Seattle by Roselyn Pape,
directed the art preparation. Richard A. Mould has helped in all phases
of readying manuscripts for the printer. Finally, and crucially, Jay F.
Wilson, of the D. Van Nostrand Company, served as Managing Editor. For
the hard work and steadfast support of all these persons and many
others, I am deeply grateful.

Edward D. Lambe
Chairman, Panel on the
New Instructional Materials
Commission on College Physics

One hears that seemingly stationary matter is really composed of
atoms in violent motion. Somehow, the motion is caused by heat. It is
often called "heat motion," but the name simply begs the question: Is
"heat" just another word for motion? As the temperature is reduced the
heat motion becomes less violent. Is heat different from temperature?
At absolute zero the atoms are . . . motionless? The last question is
usually especially puzzling to the student, who may feel that "absolute
zero" has a forbidding sound. Some have heard that absolute zero is
unattainable, which is forbidding enough.

How do we know that atoms have 
heat motion, and how does the motion 
affect what we can feel and measure? 
What happens near absolute zero? The 
answers are contained in the branches 
of physics known as kinetic theory and 
statistical mechanics. In this mono- 
graph the subjects are introduced by 
studying heat motion in a gas. 

We begin with atoms as Leucippus and his student Democritus (460-370
B.C.) assumed them to be: indivisible and unchanging. The atomic
hypothesis, as it was formed into a philosophical system by Epicurus
(341-270 B.C.), was the inspiration for the poem De Rerum Natura of
Lucretius (died ca. 55 B.C.), who viewed all of inanimate nature, life,
and society as increasingly complex systems progressively developed
from the natural laws governing atoms. This poem has been called a
versified textbook in atomic physics, and it presents a compelling
unified view of nature. But Aristotle (384-322 B.C.), who had the
greatest influence on the natural philosophy of later Greeks, found the
atomic basis of the universe and our perceptions of it distasteful and
incompatible with his own conviction of divine purpose. His criticism
of the atomists seemed sufficient reason to the medieval philosophers
for rejecting their theories, and Lucretius' poem was banned. By the
time it was made widely known again, by the Commentaries of Gassendi
(1592-1655), two thousand years had passed since Democritus.

Publication of the Commentaries revived the ancient discussions on
the nature of air, which had played an important role in the Greek
atomists' speculations on the physical influence of unseen things.
Gassendi himself helped to set the stage for the modern study of gases.
His atoms move freely in all directions, accounting for the free
diffusion of the gaseous state in apparent violation of gravity. The
particles have mass, and particles of like matter have like mass. Thus,
the density of a gas at a particular pressure depends only on the
average number in a unit volume. Being free to move in any direction,
they collide with all walls and partitions. If a partition is moved in,
so as to decrease the space available to their movements, the rate of
atomic collisions with the partition increases, and therefore more
force will be necessary to oppose their impact. With this qualitative
model, Gassendi prepared the atomic hypothesis for the quantitative
researches of Boyle (1627-1691) and Hooke (1635-1703) on the elasticity
of gases.

With this cursory view of its histori- 
cal origins, we will begin our study 
at this point, in a simplified treat- 
ment of the gas laws. 

Boyle (or Hooke) discovered, in 1660, that the pressure of a gas in a
container varies inversely with its volume, so that the product PV for
any given body of gas is a constant as the volume is changed. (It has
since been named Boyle's law.) Daniel Bernoulli gave a quantitative
explanation, in terms of atoms, in 1738.

Fig. 1.1 Diagram of an oversimplified model of molecular motions in a
cubical volume.

A simple derivation is as follows: We imagine a cubical container, of
edge length L, filled with a large number N of identical atoms. At
first we imagine that all the particles move with the same speed v back
and forth in each direction, as illustrated in Fig. 1.1. Each atom
travels back and forth between faces A and B, its momentum changing by
2mv at each collision (its mass is m), the time between two successive
impacts on each face being 2L/v. Hence, the impulse, or change of
momentum, experienced by all N particles at each wall is

$$ \frac{\text{change in momentum}}{\text{elapsed time}} = N\,\frac{2mv}{2L/v} = \frac{Nmv^2}{L} \tag{1.1} $$

in unit time. This rate of change of momentum is accomplished by the
application of forces at each collision, and clearly it is the walls
that supply the forces. According to the Newtonian principle of equal
action and reaction, the forces on the particles and on the wall are
equal in magnitude and opposite in direction. Thus, as the particles
are turned back into the container at each collision, the wall is
pushed outward. The total reaction on the wall is the total effect of a
rain of impacts. Under most conditions, all that can be detected is a
time-averaged force. The average force is equal to the average rate of
change of momentum "delivered" to the atoms by the wall: It is just the
value in Eq. (1.1) above. Because the force is imparted by many
impacts, it also seems to be distributed uniformly over the surfaces of
faces A and B. A unit area of each face experiences an average pressure
P equal to the force per unit area of the face:

$$ P = \frac{Nmv^2/L}{L^2} = \frac{Nmv^2}{V}, \tag{1.2} $$

where $V = L^3$ is the volume of the container.
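
The counting argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it tallies the momentum handed to one face over a chosen time and compares the result with the closed form N m v^2 / L, then confirms that PV does not change when the edge length does. The function names and the numerical values (atom count, mass, speed, edge lengths) are illustrative assumptions.

```python
def wall_force_by_counting(n_atoms, mass, speed, edge, duration):
    """Time-averaged force on one face, found by counting impacts.

    Every atom bounces between two opposite faces a distance `edge` apart,
    so it strikes a given face once every 2*edge/speed and hands over
    momentum 2*mass*speed each time.
    """
    period = 2.0 * edge / speed               # time between hits on one face
    hits_per_atom = int(duration // period)   # whole impacts within `duration`
    momentum = n_atoms * hits_per_atom * 2.0 * mass * speed
    return momentum / duration


def wall_force_formula(n_atoms, mass, speed, edge):
    """Closed-form rate of momentum transfer, N m v^2 / L."""
    return n_atoms * mass * speed**2 / edge


def pressure(n_atoms, mass, speed, edge):
    """Force per unit area on a face of area L^2, i.e. N m v^2 / V."""
    return wall_force_formula(n_atoms, mass, speed, edge) / edge**2


N, m, v = 1000, 6.6e-24, 1.0e5   # illustrative values: atoms, grams, cm/s
for L in (1.0, 2.0, 4.0):        # edge length in cm
    P = pressure(N, m, v, L)
    print(f"L = {L:.1f} cm  P = {P:.3e} dyn/cm^2  PV = {P * L**3:.3e} erg")

# Counting impacts over one second comes very close to the closed form.
print(wall_force_by_counting(N, m, v, 1.0, 1.0), wall_force_formula(N, m, v, 1.0))
```

The counted force differs from the formula only because a whole number of impacts fits into the chosen duration; the difference shrinks as the duration grows, which is the sense in which the wall feels a time-averaged force.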

By using an extremely artificial model we have obtained the property
PV = constant, for faces A and B. But what happens at the other faces?
Our next step is to improve the theory by using a more realistic model.
Instead of assuming that all atoms move back and forth between faces A
and B, we should expect that these two faces are not singled out, but
that all six faces are equivalent, all experiencing the same average
pressure. One way of establishing the equivalence would be by assuming
that one third of all the atoms move back and forth between A and B,
one third between top and bottom, and one third between the remaining
pair. In this way, all faces experience the average pressure

$$ P = \frac{1}{3}\,\frac{Nmv^2}{V}. \tag{1.3} $$

Again, we can see that the model is unrealistic and can be improved.
We should expect that atoms will move not only along the three
directions parallel to the sides, but will also take all the
intermediate directions, with arbitrary velocity components $v_x$,
$v_y$, $v_z$ in a Cartesian coordinate system. The axes of the
coordinate system can be oriented in any manner relative to the sides
of the container. The total behavior of all the particles will still be
uniform in all directions: There is no preference for one direction
over any other. We can imagine an "average atom" which traces out a
complicated path which samples all directions in succession, so that
its average velocity components have equal magnitudes:

$$ |v_x| = |v_y| = |v_z|. \tag{1.4} $$

Since the total velocity is the vector sum of the three Cartesian
components,

$$ v^2 = v_x^2 + v_y^2 + v_z^2, $$

each squared component equals $v^2/3$. The pressure on each face, due
to the averaging over all directions of the atomic velocities, is
therefore the same as in Eq. (1.3). Thus, we see that the pressure is
uniform on all faces, even if the atoms move in random directions,
provided that all directions are equivalent.
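
The claim that each Cartesian component carries one third of v^2 when all directions are equivalent can be tested by sampling. The sketch below is an illustration under stated assumptions rather than part of the argument: directions are drawn uniformly over the sphere by normalizing three Gaussian random numbers (a standard technique, swapped in here for the "average atom" of the text), every atom is given the same speed, and the speed, sample size, and seed are arbitrary.

```python
import math
import random


def random_direction(rng):
    """Unit vector drawn uniformly over the sphere (normalized Gaussian triple)."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0.0:
            return x / r, y / r, z / r


def mean_square_components(speed, n_atoms, seed=1):
    """Averages of v_x^2, v_y^2, v_z^2 over atoms sharing one common speed."""
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(n_atoms):
        for i, u in enumerate(random_direction(rng)):
            sums[i] += (speed * u) ** 2
    return [s / n_atoms for s in sums]


v = 1.0e5  # cm/s, illustrative
vx2, vy2, vz2 = mean_square_components(v, 100_000)
print(vx2 / v**2, vy2 / v**2, vz2 / v**2)   # each close to 1/3
```

Because the three averages always add up to v^2, any departure of one of them from 1/3 must be balanced by the others; the sampling scatter shrinks as the number of atoms grows.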

Equation (1.3) can be rewritten in terms of the kinetic energy
$\epsilon$ of an individual particle. Since $\epsilon = \tfrac{1}{2}mv^2$,

$$ PV = \tfrac{2}{3}\,N\epsilon = \tfrac{2}{3}\,U_{\text{kin}}, \tag{1.5} $$

where $U_{\text{kin}}$ is defined as the total kinetic energy of all the
particles. Therefore, the product PV is constant for a constant total
kinetic energy of the gas enclosed, as long as all particles have the
same speed.

Clearly, this last assumption is unreasonable. We cannot accept the
hypothesis that a gas of randomly moving atoms has only one
characteristic speed. More likely, there are collisions between atoms
which cause some to slow up and a few to gain most of the energy of the
two; that is, there is probably some randomness in speeds as well as in
directions. However, we might expect that there is some average¹ speed
of the atoms that is unchanging. If the average molecule has an average
speed, then it has an average energy
$\langle\epsilon\rangle = \tfrac{1}{2}m\langle v^2\rangle$, and Boyle's
law is still satisfied:

$$ PV = \tfrac{2}{3}\,N\langle\epsilon\rangle = \tfrac{2}{3}\,U_{\text{kin}}, \tag{1.6} $$

where the total kinetic energy $U_{\text{kin}} = N\langle\epsilon\rangle$.

¹ The symbol $\langle\ \rangle$ is used quite generally in physics to
denote an average quantity. Somewhat less often, a bar over the
quantity is alternatively used. For example, the average speed can be
written $\langle v\rangle$ or $\bar{v}$.

A simple extension of these arguments leads to another important
result, known as Dalton's law. Suppose that the quantity of gas is
composed of two types of atoms. The pressure at the walls is a result
of the impacts of both varieties, which we take to have numbers $N_1$,
$N_2$, and average energies $\langle\epsilon_1\rangle$,
$\langle\epsilon_2\rangle$. Then, the product PV is, after Eq. (1.6),
the result of both types of atoms bombarding the walls,

$$ PV = \tfrac{2}{3}\,N_1\langle\epsilon_1\rangle + \tfrac{2}{3}\,N_2\langle\epsilon_2\rangle, $$

which we can write in terms of two partial pressures $P_1$ and $P_2$,
where $P_1 V = \tfrac{2}{3}N_1\langle\epsilon_1\rangle$ and
$P_2 V = \tfrac{2}{3}N_2\langle\epsilon_2\rangle$. If more varieties are
present, the total pressure P is the sum of all partial pressures,
where each partial pressure is equal to the pressure that each type
would have if it alone occupied the container. Thus, for $\nu$
different atomic species, the total pressure is

$$ P = \sum_{i=1}^{\nu} P_i, \quad \text{where } P_i = \frac{2}{3}\,\frac{N_i\langle\epsilon_i\rangle}{V}. \tag{1.7} $$
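
As a small worked illustration of Dalton's law, the sketch below adds up partial pressures of the form (2/3) N_i <e_i> / V for a mixture. The function names and the two-species mixture (numbers of atoms, mean energies, volume) are hypothetical, chosen only so the arithmetic is easy to follow in cgs units.

```python
def partial_pressure(n_atoms, mean_energy, volume):
    """P_i = (2/3) N_i <e_i> / V for one species."""
    return 2.0 * n_atoms * mean_energy / (3.0 * volume)


def total_pressure(species, volume):
    """Sum of partial pressures; `species` is a list of (N_i, <e_i>) pairs."""
    return sum(partial_pressure(n, e, volume) for n, e in species)


# Hypothetical mixture: numbers of atoms and mean kinetic energies in erg.
mixture = [(2.0e19, 6.0e-14), (5.0e18, 6.0e-14)]
V = 1000.0  # cm^3

for n, e in mixture:
    print(f"N = {n:.1e}: partial pressure {partial_pressure(n, e, V):.3e} dyn/cm^2")
print(f"total pressure {total_pressure(mixture, V):.3e} dyn/cm^2")
```

Each partial pressure is computed as if its species alone occupied the volume, so removing one entry from the list lowers the total by exactly that species' contribution.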



Although we have derived the pressure equation for the special case
of a cubical container, it can be easily extended to arbitrary shapes.
For we can imagine any shape composed of a series of identical cubes,
each filled with the same number of identical atoms having the same
average speed $\langle v\rangle$. Every pair of walls of adjacent cubes
would be subject to the same average pressure. Therefore, the internal
walls are really not necessary, since they provide no net force, and
there is no discernible difference between two atoms rebounding from
opposite sides of a wall and their passing through the wall without
collision, with each one moving away from the wall into the other
atom's container. Therefore, removing all inside walls will not change
the pressure, since the total number of particles per volume N/V of the
large composite volume is the same as for each of the component
subvolumes.

Soon after the discovery of Boyle's law it became evident that it
was only conditionally obeyed. Amontons observed in 1702 that the
empirical constant increased if the quantity of gas was warmed. But the
manner of its change could not be specified quantitatively without some
means of measuring how much warmer. Although crude thermometers had
been invented by Galileo and others, no satisfactory instrument existed
until Fahrenheit (1686-1736) developed a mercury thermometer and
established a temperature scale determined by the boiling points of
liquids. With its aid, Charles (1746-1823) showed in 1787 that the
constant of Boyle's law varies linearly with the temperature. That is,
according to the temperature t on the temperature scale defined by a
mercury thermometer, the variations of pressure and volume of a fixed
quantity of gas can be described by the equation

$$ PV = C(1 + at), \tag{1.8} $$

where C and a are constants. Charles' law can be used to establish a
more convenient temperature scale, defined by the gas itself. If we
define a new temperature T in terms of the mercury thermometer's scale
by the relation

$$ T \propto 1 + at, $$

then Charles' law becomes

$$ PV = CT, \tag{1.9} $$

where C is a new constant. This simple transformation of the
temperature amounts to a shift of the zero of temperature, illustrated
in Fig. 1.2. As it was elaborated by Gay-Lussac (1778-1850) and others,
the modern form of the gas law is given in terms of the number of atoms
N and a universal constant k, the Boltzmann constant:

$$ PV = NkT. \tag{1.10} $$

Fig. 1.2 Illustrating the relation between the temperature scales
according to the mercury thermometer and the expansion of gases.

The Boltzmann constant $k = 1.380 \times 10^{-16}$ erg/atom °K is
related to the "gas constant" R by Avogadro's number
$N_A = 6.025 \times 10^{23}$ atoms/mole and the number of "moles" $n_m$:

$$ Nk = n_m N_A k = n_m R; \qquad R = 8.317 \times 10^{7}\ \text{erg/mole °K}. \tag{1.11} $$

k and R are experimentally determined, and have been found to be
universal constants, independent of the chemical species or mass of the
gas. As long as the gas is not too dense or cold, the gas law provides
an accurate description of molecular gases as well. It can be seen that
an experimental law such as Eq. (1.10) might be used as the basis for a
universal temperature scale, since it does not depend on particular
details of the experimental apparatus. It is actually used for the
establishment of an absolute temperature scale. The experimental and
theoretical details of its establishment, although of fundamental
importance in thermodynamics and statistical mechanics, are tangential
to the main purpose of this Monograph. Interested students may read an
extensive and lucid description in Heat and Thermodynamics, by M. W.
Zemansky (see bibliography).
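
The relation between the constants per atom and per mole can be verified with a few lines of arithmetic. The sketch below uses the rounded values of k and Avogadro's number quoted in the text; the function names and the sample state (one mole in 22,400 cm^3 at 273 °K) are illustrative assumptions, not taken from the Monograph.

```python
K_BOLTZMANN = 1.380e-16   # erg / (atom K), value quoted in the text
N_AVOGADRO = 6.025e23     # atoms / mole, value quoted in the text


def gas_constant(k=K_BOLTZMANN, n_a=N_AVOGADRO):
    """R = N_A k, in erg / (mole K)."""
    return n_a * k


def pressure_from_atoms(n_atoms, volume, temperature, k=K_BOLTZMANN):
    """P = N k T / V, the gas law written per atom (cgs units)."""
    return n_atoms * k * temperature / volume


def pressure_from_moles(n_moles, volume, temperature):
    """P = n_m R T / V, the same law written per mole."""
    return n_moles * gas_constant() * temperature / volume


R = gas_constant()
print(f"R = {R:.4e} erg/(mole K)")

# One mole in 22,400 cm^3 at 273 K: an illustrative choice, not from the text.
P = pressure_from_moles(1.0, 22400.0, 273.0)
print(f"P = {P:.3e} dyn/cm^2")
```

With these rounded inputs the product N_A k agrees with the quoted R to about three significant figures, and the sample pressure comes out near 10^6 dyn/cm^2, roughly one atmosphere.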

The gas law, established by over two centuries of experiments and
put into its modern form, Eq. (1.10), leads us to a microscopic
interpretation of temperature. In Eq. (1.6), we obtained an atomic
basis for Boyle's law, in terms of the average kinetic energy
$\langle\epsilon\rangle$ of the particles. It now seems that if the
product PV is found to vary with the absolute temperature, it must be
that the average energy of the atoms is changed accordingly: comparing
Eq. (1.6) and Eq. (1.10), we have

$$ \langle\epsilon\rangle = \tfrac{3}{2}\,kT. \tag{1.12} $$

Thus, with a combination of experimental fact and theoretical
argument based on the atomic hypothesis, we have come to a beautifully
simple and profound insight into the heat motion of the atoms of a gas.
However, at this stage in our development, we cannot say that Eq.
(1.12) is "proved," since it is not a purely mathematical theory. Its
ultimate test can only be done by actually measuring the average energy
of the atoms, and finding that they obey the equation. That test will
in fact be the climax of our story. But before that, we will improve
our theoretical model of the heat motions, to develop a distribution
law for the velocities. This will lead us away from our direct concern
for a while, to a description of statistical distributions in many
other systems, and a short introduction to mathematical probability
theory. We will return, in Chapter 4, to the microscopic model of a
gas, and apply statistical methods to its analysis.
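
To get a feeling for the size of the heat motion implied by an average energy of (3/2)kT per atom, one can solve (1/2) m <v^2> = (3/2) k T for the root-mean-square speed. The sketch below does this in cgs units; the choice of helium and argon, the temperature of 300 °K, and the value of the atomic mass unit are assumptions brought in for illustration and do not come from the text.

```python
import math

K_BOLTZMANN = 1.380e-16        # erg / (atom K), value quoted in the text
ATOMIC_MASS_UNIT = 1.6605e-24  # g (assumed here; not given in the text)


def mean_kinetic_energy(temperature):
    """<e> = (3/2) k T, in erg."""
    return 1.5 * K_BOLTZMANN * temperature


def rms_speed(temperature, mass_amu):
    """sqrt(<v^2>) from (1/2) m <v^2> = (3/2) k T, in cm/s."""
    mass = mass_amu * ATOMIC_MASS_UNIT
    return math.sqrt(2.0 * mean_kinetic_energy(temperature) / mass)


for name, mass_amu in (("helium", 4.003), ("argon", 39.95)):
    print(f"{name:7s} at 300 K: v_rms = {rms_speed(300.0, mass_amu):.3e} cm/s")
```

At a given temperature the average energy is the same for both gases, so the heavier atom moves more slowly, in proportion to the inverse square root of its mass.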
There is a common thread in the mathematics called statistics, which
we use to study such widely diverse fields as gambling, quantitative
measurements, biological variation, and the physical nature of matter.
Although we want primarily to understand its part in physics at this
moment, the roles that statistics plays in the other areas are also
fascinating. Moreover, by studying its appearance in many guises we can
hope to understand better how it works in one, and as we see it in each
additional situation it shows us other aspects, becomes more fully
fleshed, and familiar. In this chapter, we only talk about the
statistical laws but do not use mathematics; in the next, the
mathematical forms are derived and analyzed.



2.1 GAMES OF CHANCE 

Successful gamblers know at least the simplest facts about
statistics in their bones. They know that the laws of large numbers
include not only the most elementary, that averages of groups of many
identical things will agree, more or less, but also that deviations are
bound to occur. Improbable runs of luck are possible; they are what
makes the game interesting. Furthermore, improbable runs are bound to
happen, if one plays long enough, and the frequencies of their
appearance can be estimated. The expectation of probable and improbable
runs to be found in the game can be determined from the distribution
function of its results. The distribution function is simply a complete
record of a very large number of sample games, so that all of the
improbable events have had a chance to appear, and all of the more
likely ones to appear many times, to establish a well-defined average
behavior. If we have studied one group of many games, we know how
future groups tend to work out; the distribution function is a property
of the game, not the particular set of contests on which we base the
statistical analysis. If the sample set contains few entries, it may be
unrepresentative, containing an unusual proportion of unlikely results.
To be characteristic of the game itself, the sample set should contain
an infinite number of contests. Whether one tries to achieve very large
sets in practice or not, the concept of probability involves the
idealization that the infinite set has been played, or could be played;
and the a priori probability of a particular result is the fraction of
times that this result would appear. We say that the probability of a
well-balanced and well-tossed coin coming up heads is 1/2; in a sense,
that probability implies a definition for "well-balanced and
well-tossed." We don't know that it is so until an experiment is done,
but until it is, we accept the a priori probabilities themselves as
representing the infinite set of trials.
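
The idea that a small sample set may be unrepresentative while a very large one approaches the a priori probability can be tried out on a simulated coin. The following is only a sketch: the coin is a pseudo-random number generator assumed to be "well-balanced and well-tossed," and the seed and sample sizes are arbitrary choices.

```python
import random


def fraction_of_heads(n_tosses, rng):
    """Fraction of heads in n_tosses of a simulated fair coin."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


rng = random.Random(1965)   # fixed seed so the run is repeatable
for n in (10, 100, 10_000, 1_000_000):
    f = fraction_of_heads(n, rng)
    print(f"{n:>9d} tosses: fraction of heads = {f:.4f}")
```

Short sets can stray far from 1/2, while long sets settle close to it; no finite set reaches the idealized infinite one, which is why the a priori value is accepted in its place.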

It is amusing to read of the historic experiments designed to check
the calculations based on a priori probabilities. The naturalist
Buffon, assisted by a child tossing a coin in the air, played 2,048
partis of the Petersburg game, in which a coin is thrown successively
until a parti is brought to an end by the appearance of heads. A Swiss
astronomer studied the behavior of dice in several long sequences,
which he analyzed for all of their distinguishable combinations; he
wrote that altogether, in the course of his life, he had made 280,000
casts of individual dice. After he had completed the major part of his
investigations, he found that the results were very different from the
predictions of theory; some combinations were significantly more
frequent than the a priori probabilities implied. So the only result of
his labors was that he learned that the dice were irregular, and that
the a priori probabilities he had assumed were incorrect.

As a mathematical exercise, the
dice experiment was a failure, but as 
an investigation into the properties 
of his dice, it was a reasonable way 
to behave. It was also analogous to the 
way many physical experiments have pro- 
ceeded. The physical systems we study 
are composed of many basically identi- 
cal objects, such as molecules, as 
alike as casts of the same die. Each 
molecule can be in one of a variety of 
states, as the die can show one of 
several faces uppermost. A collection 
of many molecules is like a set of many 
casts, and the sets of results can be 
related to the properties of an indi- 
vidual molecule or to an individual 
cast by the mathematics of statistical 
theory. Distribution functions of phys- 
ical properties give us clues to the 
structure of matter, just as distribu- 
tion functions of gambling results show 
us some aspects of the game. 

There are still stronger parallels 
between physical systems and games of 
chance, and those parallels are suggested
by the similarities in the distribution
functions. We can begin to
develop an insight for the parallels by 
exploring some qualitative features 
that games of chance have in common. 
Coin-tossing and dice are not the only 
games of chance, and other games have 
their distribution functions as well. 
Allowing for changes in scale, and 
sometimes a regular distortion, they 
all show some strong resemblances. It 
is not very surprising that they should 
resemble each other in a crude way; 
after all, every game has a normal, or 
most expected result, and the rest of 
the results are less and less likely, 
so that they each have a curve with 
some sort of peak, with decreasing 
heights for the results progressively 
different from the norm. But many re- 
semble each other in more detail than 
that; these have the same mathematical 
form, called the binomial distribution. 



The binomial distribution is basic to all games of chance. Their
fundamental similarity is due to the nature of pure chance, in which
every event is alike, and no event is influenced by those that come
before or after. Each time the dice are thrown, the chance of a
particular result does not depend on whether that number has been
appearing unusually often. Since each result is independent, there is
no way for a run of luck to prolong itself, and the run will continue
or cease according to the results of each successive throw, having the
same chance as any other cast. All games of chance are this way, and if
we allow for variations in their designs, such as the number of
possible results of a single event, they have similar distribution
functions. The basic distribution is the binomial; under certain
conditions, it takes on forms which are easier to use for computations.
The two derived forms most commonly used in physics are the Poisson and
the normal (or Gaussian) distributions.
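
The statement that independent, identical events lead to the binomial distribution can be illustrated by playing a simple game many times and comparing the record with the binomial formula. In this sketch the game (counting "sixes" in ten casts of a fair die), the number of games, and the seed are assumptions made for illustration; the formula itself is the standard binomial probability, which the text derives only in a later chapter.

```python
import math
import random


def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def simulate_counts(n, p, n_games, seed=2):
    """Play n_games games of n independent trials; tally the successes."""
    rng = random.Random(seed)
    tally = [0] * (n + 1)
    for _ in range(n_games):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        tally[successes] += 1
    return tally


n, p, n_games = 10, 1.0 / 6.0, 50_000   # e.g. "sixes" in ten casts of a die
tally = simulate_counts(n, p, n_games)
for k in range(n + 1):
    observed = tally[k] / n_games
    print(f"k = {k:2d}   observed {observed:.4f}   binomial {binomial_pmf(n, k, p):.4f}")
```

Each trial in the simulation ignores what came before it, which is exactly the independence the paragraph describes; the tally is a finite stand-in for the distribution function of the game.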



2.2 PRECISION MEASUREMENTS 

How can it be that these mathematical forms have a place in physics?
Science and gambling seem worlds apart; one representing reason, and
the other, the complete lack of it. Is it only an accident that their
mathematics show some resemblances? Actually, it is no accident, since
there are many ways in which chance plays an important role in science,
and when it does, its signature is found in the presence of one of the
characteristic distribution functions.

When they do appear, it is often in the most intentional and logical
operations. For example, one of the most precise technical
manipulations one can do today is to measure a physical quantity. The
measurement of the length of a rod, if repeated several times, will
give a set of slightly different answers. Of course, more precise
instruments will give answers which cluster together more closely:
Measurements taken with a micrometer will show less scatter than a
series obtained with a ruler. Each set of measurements will have an
average, the "length" of the rod, and each set will contain values
which become less numerous as they move further from the average. If
the measurements are done without systematic errors (such as squeezing
the rod too hard with the micrometer or always reading the ruler at a
slant), both sets of measurements give the same average value. That is,
the ruler values may average to 0.69 inch, say, and the micrometer
values average to 0.6883 inch: They agree, if we allow for the lower
precision of the ruler. The ruler values have a wider spread, and we
may find that several are greater than 0.71 inch, while there are no
micrometer values greater than 0.690 inch. The spread, or dispersion,
of "wrong" results varies according to the instrument and to the way it
is used, but allowing for the difference in precision, both sets of
results tend to have very similar shapes. They are usually
approximately Gaussian, at least over their central regions, with only
an average value and a dispersion to differentiate one set from
another. And now, there are no parameters left which might refer to the
nature of the rod or the instrument with which the data were obtained.
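
The comparison of ruler and micrometer can be imitated with simulated readings. This is a sketch under explicit assumptions: the errors are taken to be exactly Gaussian and free of systematic bias, the true length is set to 0.6883 inch, and the two instrument scatters (0.01 inch and 0.0005 inch) are invented for the illustration rather than taken from any real instrument.

```python
import random
import statistics


def measure(true_length, instrument_sigma, n_readings, rng):
    """Repeated readings with unbiased, normally distributed random error."""
    return [rng.gauss(true_length, instrument_sigma) for _ in range(n_readings)]


rng = random.Random(3)
TRUE_LENGTH = 0.6883   # inch; the micrometer average used in the text
ruler = measure(TRUE_LENGTH, 0.01, 500, rng)          # assumed ruler scatter
micrometer = measure(TRUE_LENGTH, 0.0005, 500, rng)   # assumed micrometer scatter

for name, data in (("ruler", ruler), ("micrometer", micrometer)):
    print(f"{name:10s} mean = {statistics.mean(data):.4f} in."
          f"   dispersion = {statistics.stdev(data):.4f} in.")
```

The two sets differ only in the dispersion handed to the generator, so once each set is rescaled by its own dispersion their shapes coincide, which is the point made above about the absence of any parameter describing the rod or the instrument.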

The Gaussian distribution is good not only for measurements of the
length of a short rod by rulers and micrometers, but for any lengths
and distances, and for instruments of any precision. There is not even
a parameter to describe what sort of physical quantity is being
measured; the statistics of "wrong" answers is quite independent of the
nature of length. It is just as true of mass, and the preceding
description could be immediately extended to describe a measurement of
mass simply by changing a few words: "mass" for "length," and perhaps
"spring scale" for "ruler" and "analytical balance" for "micrometer."
And it is true of all physical measurements, no matter how
sophisticated the experiment. The precision of some physical
measurements is now greater than 1 part in 10 billion: it is just as
true for them. Of course, there are limitations on such a general rule.
There must be many possible "wrong" answers from the result of a single
measurement for the distribution to be Gaussian. Otherwise, the
distribution is composed of discrete possibilities, and the binomial
distribution is more appropriate in that case, as it is for
coin-tossing and dice. The errors must also be truly random, with no
especially large or unusually probable errors of a particular kind or
magnitude. "Random" is almost a definition of the pattern of errors
that follows the simple statistical distributions. We can interpret the
mathematical similarity between the statistics of physical measurements
and games of chance in two ways: in one sense, a crapshooter is making
a series of measurements to establish the average behavior of his dice;
in another sense, it is a gamble that random variations will cancel out
when we make a physical measurement. Actually, the signature of one of
these distributions is usually highly prized by the research worker or
statistician, because it indicates that the data has probably been
collected without bias: even the "bad" results were kept.

2.3 QUALITY CONTROL 

But who is to tell us what is "bad"? The variations belong to all
aspects of the measurement: the instrument, its use, and the measured
quantity itself. They spring from causes ranging from the most trivial,
such as careless work, to the most complex, such as are due to the
nature of the thing measured. All of these variations contribute to the
total spread of the measurements, and there is no way of examining a
single set to determine what variations are due to a poor instrument or
are characteristic of the thing measured. For the single set of many
experiments, all of the random scatter is real, and there are no
"wrong" measurements. (Not all variations are random, however;
sometimes unusual events interfere with the course of a measurement,
and cause some values to deviate from the norm by much larger amounts
than expected. When this happens, the deviant result may be tested
against a standard distribution, and it may be discarded if it fails to
meet quantitative criteria.) Sometimes it is apparent that most of the
scatter of results is due to a poor instrument, and that the measured
quantity is much more uniform than the data indicate. But unless the
measurements are repeated with a more precise instrument, there is no
way of reading coarse data to tell which errors are instrumental.
However, if more coarse measurements are made, their average becomes
more reliable, and in the limit of an infinite number of measurements,
it is unaffected by the random errors of the instrument and its
operator.

Fig. 2.2 Hyperfine structure separation of the ground state of
deuterium, measured by atomic beam resonance by A. G. Prodell and P.
Kusch (Phys. Rev. 88, 184 (1952)). Histogram showing the distribution
of 350 measurements of the frequency, together with a Gaussian error
curve.
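
How quickly the average of coarse measurements becomes reliable can be estimated by repeating the whole measuring procedure many times and watching the scatter of the resulting averages. The sketch below assumes Gaussian, unbiased errors and an invented single-reading scatter of 0.01 inch; the comparison column sigma/sqrt(n) is the standard result for independent errors, stated here as background rather than as something the text has derived.

```python
import math
import random
import statistics


def scatter_of_averages(n_readings, sigma, n_repeats, rng, true_value=0.0):
    """Standard deviation of the mean of n_readings, over many repeated sets."""
    means = [
        statistics.fmean(rng.gauss(true_value, sigma) for _ in range(n_readings))
        for _ in range(n_repeats)
    ]
    return statistics.pstdev(means)


rng = random.Random(4)
SIGMA = 0.01   # assumed scatter of a single coarse reading, in inches
for n in (1, 4, 16, 64, 256):
    observed = scatter_of_averages(n, SIGMA, 2000, rng)
    print(f"n = {n:3d}   scatter of the average = {observed:.5f}"
          f"   sigma/sqrt(n) = {SIGMA / math.sqrt(n):.5f}")
```

The scatter of the average falls only as the square root of the number of readings, so a coarse instrument can stand in for a fine one, but at the price of a great many repetitions; systematic errors, which do not average away, are outside this sketch.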

If, instead of measuring the 
length of a single rod, we measure the 
lengths of many apparently identical 
rods, the measurements show wider vari- 
ations. The variations can be attribu- 
ted to real variations in the lengths, 
independent of the experimenter or his 
micrometer. Variations can be perceived 
even among supposedly identical rods 
made by the same automatic machine. 
Whatever errors contribute to the variations
are errors in the way the machine
repeats its manufacturing cycles.
These variations, which are due to many
imperfections, produce irregular products
whose properties are often found
to fall on Gaussian distribution
curves. No matter what property is considered,
or how standardized the product,
examination usually discloses similar
distributions. They come from the
same fundamental behavior of random variations
as those of repeated measurements,
except that in these cases, one
cannot attribute the variations to human
mistakes. They are also a feature
of automation, and the statistical analysis
and limitation of the resulting
imperfections form the basis for the
fields of "quality control," "process
control," or "industrial statistics."




Fig. 2.3 85 rolls of tape classified by adhesion
values. After John D. Heide, Industrial
Process Control by Statistical Methods
(McGraw-Hill Book Company, New York, 1952).




Fig. 2.4 Heights (excess over 1 m) of 767
six-year-old boys, in a group selected for
uniform arm length. After Paul Peach, An
Introduction to Industrial Statistics and
Quality Control (Edwards and Broughton Co.,
Raleigh, 1945).

In many cases the manufacturing
process can be seen to be imperfect. A
shaft may have worn bearings, a cutting
tool may chatter against the work, or a
switch contact may be dirty. When such
defects are corrected the machine makes
products of greater uniformity. But
variations will still be present, due
to minor defects in the machine, and
these variations will still fit a distribution
curve, although one of smaller
dispersion. The process of machine
improvement is a never-ending spiral
toward smaller variations and a narrower
range of results, but they are still
describable by a distribution.

Somewhere along the line of de- 
creasing errors one finds a limitation 
in the nature of the materials of the 
machine and the product. For example, 
one of the technical properties of met- 
als and alloys is their ability to 
withstand repeated bending and flexing. 
The bending produces a network of crys- 
talline dislocations, microscopic 
cracks, and eventually fracture. Dif- 
ferent alloys have enormously different 
resistances to this sort of fatigue, 
and their resistances are obviously a 
matter of great technical importance.

In many schools and laboratories there 
are machines which test standard sam- 
ples by bending them repeatedly until 
they break. They work under controlled 
conditions, with fixed angle of bend, 
radius of curvature, number of bends 
per minute, and so forth. With every 
significant variable kept uniform, ap- 
parently identical samples have mark- 
edly differing lifetimes: The lifetimes 
fall on a distribution curve of consid- 
erable width. There seems no way to 
make the pieces break after an iden- 
tical number of cycles: the scatter 
is in the nature of the samples, in 
the random processes of the atomic 
motions and arrangements that lead to 
the microscopic dislocations and 
cracks. If there were a way to make
them all show fatigue and crack at the
same point, one would be able to save
considerable sums of money by replacing
parts just before they were able to
crack. Just such a supposition, that a
good metallurgist can predict when particular
pieces (aircraft wings) were
due to break, formed the central theme
of a popular novel and movie. It was
good entertainment, but terrible physics.
And yet, it is almost reasonable
to expect that all of the scatter can 
some day be eliminated. Years ago, var- 
iations among samples were much 
greater. Looking back, we know that 
they were caused by gross variations in 
alloy composition, heat treatments, and 
machining. Modern metallurgical practice
has placed all of these variables
under much stricter control, and the
distribution functions of the alloy
properties have responded by becoming
much narrower. It is now difficult to
make them still narrower, since the
relatively easy controls have already
been applied. If very great improvements
are still to be made, it seems
unlikely that they will be accomplished
simply by improving our present manu- 
facturing techniques, but will have to 
depend on quite novel and sophisticated 
procedures, which will control even the 
pattern and density of crystalline im- 
perfections in each sample. Until then, 
it seems that the variations among sam- 
ples can still be classed as accidental 
errors in their fabrication, either 
directly due to the people who design 
and control the methods and machines, 
or to imperfections in the machines 
themselves. Gaussian and similar dis- 
tributions of the properties of the 
samples could then arise in essentially 
the same way they do in a series of 
physical measurements, out of the ran- 
domness of errors and cancellations of 
their effects. 



2.4 BIOLOGICAL VARIATION 

If the distributions in the prop- 
erties of artifacts were only due to 
human or mechanical errors, we should 
expect to see very different patterns 
among natural objects. There are many 
types of essentially identical things 
that have natural, or nonhuman origins. 
Some of the most familiar are living
forms of the same species. They are as
alike as peas in a pod. How about peas? 
Mendel used them for his studies of in- 
heritance. The regularities he discov- 
ered are expressions of the multinomial 
distribution, an extension of the bi- 
nomial for the case in which more than 
two results of a trial are possible. 

The statistics of pea colors pointed 
toward the mechanism of inheritance, 
and we now understand how the random 
encounters of pairs of chromosomes lead 
to the regularities of the Mendelian
laws. Within a single genotype there 
are remaining variations in other char- 
acteristics: size, weight, firmness, or 
germination time, and these variations 
are usually Gaussian. As it is for 
peas, the same distribution law occurs 
throughout the biological world; no 
form of life is exempt, not even our 
own. There are tons of statistics on 
human measurements, ranging from the 
ordinary sort of height, weight, and 
so forth (see Fig. 2.4), to micro- 
scopic, physical, and chemical studies 
requiring highly specialized instru- 
ments. Whenever a large number of sam- 
ples is obtained, the familiar mathe- 
matical forms emerge. Normal distribu- 
tions characterize nonphysical quali- 
ties as well, such as I.Q. and test 
scores. When a class is graded on a 
curve, that curve is usually an approx- 
imation to one of the simple distribu- 
tion laws. In fact, the branch of math- 
ematics called statistics has origins 
closely associated with the analysis of 
human populations. Among the early 
studies were the proportion of a popu- 
lation that could be expected to bear 
arms, life insurance tables, and the 
immunity to smallpox. Sometimes it is 
easy to understand how a particular 
phenomenon, which has its basis in ac- 
cident, leads to the statistics of 
chance. For example, an analysis of the 
number of deaths in given years by 
kicks from a horse disclosed a Poisson 
distribution. Yet, the same distribution
describes the rate of telephone
calls arriving at a central switchboard.
And even the apparently intentional
and organized actions of society
seem to follow the same pattern: the
numbers of outbreaks of war per year
from 1820 to 1939 have been found to
follow a Poisson distribution remarkably
well.

Observing binomial, Poisson,
Gaussian, and related distributions
in biological populations is one thing,
but understanding why they appear is another. It is
not at all obvious that the "Law of Error"
should also describe the variations
among biological properties. Its
interpretation caused a controversy
among scientists and mathematicians
that lasted over a century. The arguments
are worth describing here, because
some of them help to illuminate
the mystery of the same law's appearance
in physics.

In 1835 the Belgian astronomer
Quetelet published "Essai de Physique
Sociale," with which he gave a substantial
basis to the new theory of statistics.
He was the first to present convincing
arguments for what has been
called the constancy of large numbers,
with particular application to men. The
success of Quetelet's statistics combined
with the contemporary rationalism
to convince him that he had demonstrated
that society obeyed eternal
and natural laws. As the measurements
of a physical quantity were scattered
about the true value because of error,
so the ideal biological models were
thought to be imperfectly copied by
nature, which, in his view, always
strives for perfection. Quetelet's
ideas were comfortable for his society
and his time: It was an age of supreme
reason and natural perfection. But as
the philosophy of society relaxed to
one of greater liberalism, new interpretations
were made. In the debate,
which still echoes, the distinctions
between "error" and "variation" have
been blurred.

In physics we are on ground fur- 
ther removed from ideas of perfection 
and purpose, errors, and mistakes. The 
familiar distribution laws seem to be 
everywhere. The velocities of the atoms
of a gas are found to be Gaussian in
any direction; so are the velocities of
the conduction electrons in a semiconductor.
The numbers of atoms of a radioactive
bit of matter that decay in
each second follow a Poisson distribution
law; so does the number of molecules
of hydrogen in each cubic centimeter
of interplanetary space. Instead
of lengthening this list of examples
from physics, we will end the discussion
of the statistical laws here, to
examine their basis by mathematical
analysis.



INTRODUCTION TO THE
MATHEMATICS OF PROBABILITY



Coin-tossing is one of the simplest 
examples of systems of random events, 
and yet it can lead us to an under- 
standing of important and current 
fields of physics. The theory is read- 
ily adaptable to phenomena as diverse 
as the magnetism of solids, the evap- 
oration of liquids, and the structure 
of alloys. These phenomena share a reliance
on microscopic units which can
be idealized as having only two possible
states or conditions of some sort:
↑ ↓, + −, ○ ●, or heads-tails. By simple
extensions of the mathematics, it
will be possible to analyze phenomena
springing from more than two states,
and these will include the molecules in
a gas that we set out to discuss in
Chapter 1.



3.1 THE BINOMIAL DISTRIBUTION:
A SYMMETRIC COIN

We assume that the probability of
coming up heads is equal to the probability
of tails:

$$p(H) = p(T).$$

A result is certain to be either heads
or tails, so that for a perfectly balanced
coin,

$$p(H) + p(T) = 1; \qquad p(H) = p(T) = 1/2. \tag{3.1}$$

For a sequence of several tosses, our 
primary rule is that each toss is an 
independent event, unaffected by pre- 
ceding results, and always with the 
same probabilities, no matter how long 
we play. 

Yet, in a sequence of throws, we 
know that it is most likely that heads 
and tails will appear about equally 
often, and that it is quite unlikely 
that only heads will come up. The reason
for this is not that the coin or
luck eventually makes a correction, but 
is simply a result of the independence 
of probabilities of each toss. For in- 
stance, let us suppose that there has 
been a sequence of tosses, perhaps 100 
heads in a row, and that the probabil- 
ity of this occurrence is some (small!) 
number p(100H) . Now, when the next toss 
is made, the probability of one more 
head is 1/2, just as for a single toss. 
This means that, if we have been so
lucky (or unlucky) as to get 100 heads 
in a row, we have only one chance in 
two of extending our string one more 
toss, to 101 in a row. We can then 
write that the probability for 101 
heads is 

$$p(101H) = p(H)\,p(100H) = \tfrac{1}{2}\,p(100H).$$

Now, if the toss is a head, we know 
what the chance is of extending the 
string to 102: 

$$p(102H) = p(H)\,p(101H) = \tfrac{1}{4}\,p(100H).$$

Thus, we see that the chance of extend- 
ing the string of successes decreases 
with each additional success, by the 
same factor each time. The improbabil- 
ity of a long string is just the result 
of the repeated product. Therefore, the 
probability of 100 heads in a row is 
the repeated product of 100 factors of
1/2, or

$$p(100H) = (1/2)^{100}.$$



Of course, the probability of 100 tails 
in a row is the same. Moreover, since 
the probability of a tail on any single 
toss is precisely the same as the prob- 
ability of a head, we get the same 
chance for a string of 99 heads in a 
row, and then one tail, as for 100 
heads. Or of 23 heads, a tail, and 76 
heads. Or of 23 heads, and then 77
tails. Or even of an alternating set of
heads and tails, adding up to 50 heads
and 50 tails! This seems wrong, until 
we realize that we are specifying pre- 
cise results for individual tosses, in- 
stead of simply letting the vagaries of 
chance work things out in the long run. 
When we relax our conditions, and let 
heads or tails come up on any individ- 
ual toss, we find that the string is 
likely to have about 50 heads and 50 
tails. But we have to include runs in 
which we may get clusters of several 
heads or tails in a row, as well as al- 
ternating sides, and also strings in 
which there are first 50 heads and the 
rest tails. We begin to see that the
dominance of the average behavior arises
because there are many ways of reaching
it, as long as we do not specify the
detailed route.

For N tosses, each distinct sequence
has probability $(1/2)^N$. The generation
of each sequence can be seen as
a branching diagram, as in Fig. 3.1
below: At each junction, the chance of
following one of the two branches is
1/2, so that the chance of taking one
particular route containing N junctions
is $(1/2)^N$. How many different routes
are there for a diagram of N junctions?
Since each junction has two branches,
there are simply $2^N$ different routes.
Since the probability of each route is
$(1/2)^N$, the sum of probabilities of all
routes is therefore $2^N (1/2)^N = 1$, as it
should be: If N junctions have been
passed, it is certain that we have
taken one of the possible routes to the
end.

Fig. 3.1 Alternative sequences of the results
of tosses of a coin.

Usually, we are not interested in
the order in which heads or tails appeared,
but simply in how many there
were in the whole sequence. If we put
the different sequences into such
classes, a distribution begins to
emerge, with some classes much more
popular than others: examples of such
distributions are shown in Fig. 3.2. If
the number of tosses is very small, we
can sort the sequences into classes by
preparing tables, but tables are impractical
for longer runs: the number
of entries for N = 20 would be
$20 \times 2^{20} = 20{,}971{,}520$. Clearly, some shortcuts
are necessary. Fortunately, they are
not hard to develop. Suppose we wish to
determine how many different sequences
of N tosses contain n heads and m tails
in any order. To begin, we think of
each H as labeled with a distinguishing
mark. Let them be numbered $H_1, H_2, \ldots, H_n$;
they may each appear on any one of
the N tosses not already "filled" by
another H. After all the H results are
distributed in a sequence, the remaining
(N - n) tosses are occupied by the
m T's that remain: m and (N - n) are
the same.

Beginning with the first, we can
imagine $H_1$ in any of N places. Since
one space is filled, there are only
(N - 1) openings for $H_2$, and progressively
fewer for each additional one,
down to (N - n + 1) for $H_n$. All together,
the total number of combinations is the
continued product

$$N(N-1)(N-2)\cdots(N-n+1).$$

Fig. 3.2 Finding maxima and minima of functions
by differentiation.

A convenient symbol for a continued
product is the factorial, defined
for an integer z by

$$z! = z(z-1)(z-2)\cdots(3)(2)(1),$$

with 0! = 1. The number of different
arrangements of n distinguishable H in
N tosses can then be written as
N!/(N - n)!. But in an actual sequence
the different H are not distinguishable,
and there is no difference between
a run such as

$$H_1\ T\ T\ H_2\ T\ H_3 \ldots$$

and another of the form

$$H_1\ T\ T\ H_3\ T\ H_2 \ldots$$

The factor by which we have overcounted
is the number of different orderings we
can make of n distinguishable things:
it is simply n!. Therefore, the number
of different sequences of n identical H
in N tosses is N!/[(N - n)! n!]. This number
is the statistical weight w(nH, mT)
of the category of n heads and m tails
in N = n + m tosses: It is an important
result, and we rewrite it in more symmetric
form:

$$w(nH, mT) = \frac{N!}{n!\,m!}. \tag{3.2}$$

Since the probability of every different
sequence is $[p(H)]^N$, the probability
of getting any member of the class
(nH, mT) is

$$p(nH, mT) = w(nH, mT)\,[p(H)]^N. \tag{3.3}$$

The idea of a statistical weight is
useful in physics, when it is practically
impossible to distinguish individual
arrangements of microscopic
quantities, and only those properties
characteristic of classes of arrangements
might be determined.

The graphs in Fig. 2.1 were drawn
according to Eqs. (3.2) and (3.3). Additional
properties of the symmetric
binomial distribution are described in
the next section.
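
The counting argument behind Eqs. (3.2) and (3.3) can be checked by
brute force for a short run. The following is a minimal sketch, not part
of the original development: the function names `weight` and
`weights_by_counting` are illustrative, and N = 10 is an arbitrary small
choice that keeps the table of all $2^N$ sequences manageable.

```python
from itertools import product
from math import factorial

def weight(n, m):
    """Statistical weight w(nH, mT) = N!/(n! m!) with N = n + m, Eq. (3.2)."""
    return factorial(n + m) // (factorial(n) * factorial(m))

def weights_by_counting(N):
    """Sort all 2**N sequences of N tosses into classes by number of heads."""
    counts = [0] * (N + 1)
    for seq in product("HT", repeat=N):
        counts[seq.count("H")] += 1
    return counts

N = 10
counted = weights_by_counting(N)
for n in range(N + 1):
    w = weight(n, N - n)
    # columns: heads, sequences counted, w from Eq. (3.2), p from Eq. (3.3)
    print(n, counted[n], w, w * 0.5**N)
```

The counted and computed weights agree class by class, and the weights
add up to $2^N$, so the probabilities of Eq. (3.3) add up to one.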

Among the most interesting fea- 
tures to explore are the expected num- 
ber of heads and the dispersion. We ex- 
pect that the distribution should have 
a shape that is a progressive develop- 
ment from the discrete graphs of Fig. 
2.1, with steps becoming less distinct
and the whole curve narrowing as N in- 
creases. Instead of graphing a succes- 
sion of curves, we can discover these 
features by analysis. 

For numbers larger than 10 the
factorial is tedious to calculate
(however, there are published tables of
z! for 0 < z < 1000). In statistical
physics it is almost always sufficiently
accurate to use an approximate
expression for z! when z > 10. Stirling's
approximation is

$$z! \approx (2\pi z)^{1/2} \left(\frac{z}{e}\right)^{z}, \tag{3.4}$$

where e is the base of the natural logarithms.
The formula is accurate to
within (9/z) percent. Using Stirling's
approximation for the factorials in Eq.
(3.2), we obtain a more convenient form
for the statistical weight of the class
of n heads and m tails:

$$w(nH, mT) \approx \left(\frac{N}{2\pi n m}\right)^{1/2} \frac{N^N}{n^n m^m}. \tag{3.5}$$
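
The quoted accuracy of Stirling's approximation is easy to test against
exact factorials. This is a small sketch for illustration; the helper
names `stirling` and `percent_error` are not from the text, and the
sample values of z are arbitrary.

```python
from math import e, factorial, pi, sqrt

def stirling(z):
    """Stirling's approximation, Eq. (3.4): z! ~ sqrt(2 pi z) (z/e)**z."""
    return sqrt(2 * pi * z) * (z / e) ** z

def percent_error(z):
    """Percentage by which Eq. (3.4) falls short of the exact z!."""
    exact = factorial(z)
    return 100 * (exact - stirling(z)) / exact

for z in (5, 10, 20, 50, 100):
    print(z, round(percent_error(z), 3), "bound 9/z:", round(9 / z, 3))
```

The shortfall behaves like 1/(12z), or about 8.3/z percent, which is
consistent with the stated bound of 9/z percent.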

It is convenient to write the expression
for statistical weight in
terms of a single parameter for the
fraction of heads in the total number
of trials. Defining the ratio r = n/N,
then (1 - r) = m/N, and substituting in
Eq. (3.5), the statistical weight can
be expressed as a function of r and N
only,

$$w(r,N) \approx [2\pi N r(1-r)]^{-1/2}\, r^{-rN} (1-r)^{-(1-r)N}. \tag{3.6}$$

To locate the value of r corresponding
to the greatest statistical
weight, we search for the region where 
a small change in r does not affect the 
result. It is the same principle that 
one could use to locate the peak of a 
physical mountain, by finding a spot 
where a little motion forward or back- 
ward would not cause any change in 
one's altitude. Of course, the tech- 
nique would give all the intermediate 
peaks and valley floors as well, so it 
must be used in conjunction with a few 
rough observations in order to select 
the highest value from among all the 
flat spots. But it should give us lit- 
tle trouble in the present application, 
since we expect to find only one peak, 
with steadily falling values on either 
side. 

When N is sufficiently large, the
variable r can be treated as continuous
rather than discrete. Instead of trying
to locate the peak by the method of finite
differences, we can then use the
differential calculus, which is much
easier (as long as one knows differential
calculus!). For any continuous and
differentiable function y = f(x), the
extrema (maxima and minima) of y correspond
to places where the derivative
of y with respect to x is zero: There
is an extremum at x = x' if

$$\left.\frac{dy}{dx}\right|_{x = x'} = 0.$$

In Fig. 3.2, there are maxima at
$x = x_1$ and $x_3$, and minima at $x = 0$, $x_2$,
and $x_4$. By differentiating f(x) and
setting the derivative to zero, we should find all
of these values as roots of the equation.
But we must inspect y itself to
determine that the root $x = x_1$ is the
one that yields the largest value of y.

Applying the technique to Eq.
(3.6) we can find $r_m$, the most probable
value of r, which corresponds to
the most frequently appearing fraction
of heads. We could follow the prescription
by differentiating p(r,N) directly,
as

$$\frac{dp(r,N)}{dr} = \frac{dw(r,N)}{dr}\,[p(H)]^N = 0,$$

for the case p(H) = p(T).

Instead of working with the probability
directly, it is much easier in the
case of large exponents to work with
the logarithm of the probability. In
this way we will be able to separate
the individual terms and to pick out
those that are most important. The
habit of working with logarithms of
functions rather than the functions
themselves is characteristic of statistical
physics; it will come up again.
Since ln p is a monotonic function of
p, the extrema of ln p occur at the
same positions as the extrema of p itself.

Remembering that ln (ab) = ln a
+ ln b, ln (a/b) = ln a - ln b, and
ln $(a^b)$ = b ln a, taking the logarithm
of $p(r,N) = w(r,N)[p(H)]^N$, with w(r,N)
given by Eq. (3.6), gives us

$$\ln p(r,N) = -\tfrac{1}{2}\ln(2\pi N) - \tfrac{1}{2}\ln r - \tfrac{1}{2}\ln(1-r) - rN\ln r - (1-r)N\ln(1-r) + N\ln p(H).$$

Differentiating with respect to r, and
using d/dx (ln x) = 1/x, we obtain

$$\frac{d}{dr}\left[\ln p(r,N)\right] = \frac{1}{2}\left[\frac{1}{1-r} - \frac{1}{r}\right] + N\ln\left[\frac{1-r}{r}\right].$$

If N is very large the second term is
dominant in the intermediate range of r.
In this region the only point where

$$N\ln\left[\frac{1-r}{r}\right] = 0$$

is that for which (1 - r)/r = 1. Thus,
$r_m = 1/2$. The result is independent of N:
The most probable result is that heads
comes up half the time in any length of
sequence.
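
The location of the peak can also be confirmed numerically, without
Stirling's approximation. The sketch below is illustrative only: it
evaluates the exact logarithm of the statistical weight through the
log-gamma function, a substitute technique chosen here for convenience,
and N = 1000 is an arbitrary choice. The function `dlnp_dr` is the
derivative obtained from the Stirling form for a symmetric coin.

```python
from math import lgamma, log

def ln_weight(n, N):
    """Exact ln w(nH, mT) = ln N! - ln n! - ln (N - n)!, using lgamma."""
    return lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)

def dlnp_dr(r, N):
    """d/dr [ln p(r, N)] from the Stirling form, for a symmetric coin."""
    return 0.5 * (1 / (1 - r) - 1 / r) + N * log((1 - r) / r)

N = 1000
n_best = max(range(N + 1), key=lambda n: ln_weight(n, N))
print("most probable fraction of heads:", n_best / N)
print("slope at r = 0.4, 0.5, 0.6:",
      [round(dlnp_dr(r, N), 2) for r in (0.4, 0.5, 0.6)])
```

The slope is positive below r = 1/2, zero there, and negative above,
which is the signature of a single maximum.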

Sometimes the average value in a
distribution is not the same as the
most probable value. They will be different
when the values are not distributed
symmetrically about the peak of
the curve (the distribution is skewed).
For example, if in a group of 100 test
papers there are 10 perfect scores but
the rest are distributed in steps of 1
from 1 to 90, the most probable score
is 100, but the average is only 50.95.
The average value of x, which is usually
written $\langle x \rangle$ or $\bar{x}$, can be found
from the probability distribution by
remembering that the probability is
equal to the fraction of times that a
result appears. The average is formed
by totaling all of the values that appear
and then dividing by the number of
entries. Thus, if there are N trials, a
probability $p(x_i)$ means that $N p(x_i)$
results gave the value $x_i$. These add
the contribution $x_i N p(x_i)$ to the total.
With $\nu$ distinct values, the average is then

$$\langle x \rangle = \frac{1}{N}\left[x_1 N p(x_1) + x_2 N p(x_2) + \cdots + x_\nu N p(x_\nu)\right] = \frac{1}{N}\sum_{i=1}^{\nu} x_i N p(x_i) = \sum_{i=1}^{\nu} x_i\, p(x_i). \tag{3.7}$$
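
The test-paper example can be worked through directly from Eq. (3.7).
This is a short sketch; the variable names are illustrative, and the
list of scores is built exactly as described in the example (ten perfect
papers and one paper at each score from 1 to 90).

```python
from collections import Counter

# Ten perfect papers, and one paper at each score from 1 to 90.
scores = [100] * 10 + list(range(1, 91))

N = len(scores)
p = {x: count / N for x, count in Counter(scores).items()}  # p(x_i)

most_probable = max(p, key=p.get)                # score with largest p(x_i)
average = sum(x * p_x for x, p_x in p.items())   # Eq. (3.7)

print(most_probable, round(average, 2))
```

The most probable score and the average differ widely here because the
distribution is strongly skewed.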

If the probability varies little
from one value of x to the next, so
that $|p(x_{i+1}) - p(x_i)| \ll p(x_i)$, we may
treat x as a continuous variable. In
this case, the probability dp(x) that
the result lies between x and x + dx is
proportional to the interval dx:

$$dp(x) = f(x)\,dx. \tag{3.8}$$

f(x) is known as the probability density.
It is a normalized function,
since the probability is unity that x
has some value out of the total possible
range:

$$\int dp(x) = 1 = \int_{-\infty}^{\infty} f(x)\,dx. \tag{3.9}$$

Averages of functions of variables
are often of interest. For example, if
results having large values are particularly
important, we may want to develop
some property of the distribution
which emphasizes the large results more
than the smaller ones. If a high school
curriculum is designed to stimulate the
few unusually talented students, their
grades may be improved to very high
levels, but they would make little difference
to the class average. We could
emphasize the appearance of their high
scores by averaging not the scores
themselves but the squares of the
scores, or even higher powers. The prescription
for averaging the squares of
quantities is formally the same as for
the quantities themselves:

$$\langle x^2 \rangle = \sum_{i=1}^{\nu} x_i^2\, p(x_i) \tag{3.10}$$

for discrete values of x, and

$$\langle x^2 \rangle = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$

for continuous values.

For any integrable function g(x), the
prescription is

$$\langle g(x) \rangle = \int_{-\infty}^{\infty} g(x) f(x)\,dx. \tag{3.11}$$

A very interesting feature of the averaging
process is that the average of a
function is not identical to the function
of the average: they are noncommuting
operations in general. A simple
example is $x^2$, and the set of numbers
in Table 3.1 illustrates the inequality

$$\langle x^2 \rangle \neq \langle x \rangle^2.$$





                 x         x^2

                -3          9
                -2          4
                -1          1
                 0          0
                 1          1
                 2          4
                 3          9

TOTALS:          0         28
AVERAGES:    <x> = 0    <x^2> = 4

Table 3.1 An illustration of the noncommutability
of the operations of averaging
and of making an arbitrary function of a
variable. The example shows that $\langle x \rangle^2 \neq \langle x^2 \rangle$.

The relative widths of distributions
can be expressed in terms of $\langle x^2 \rangle$
and $\langle x \rangle^2$. The dispersion $\sigma$ is defined
by the relation

$$\sigma^2 = \langle x^2 \rangle - \langle x \rangle^2. \tag{3.12}$$

It can also be written in the form

$$\sigma^2 = \langle (x - \langle x \rangle)^2 \rangle. \tag{3.13}$$

Fig. 3.3 Binomial distributions for various
values of N and p. After G. P. Wadsworth and
J. G. Bryan, Introduction to Probability and
Random Variables (McGraw-Hill Book Company,
New York, 1960).



For the binomial distribution, the dispersion
can be calculated from the discrete
finite series. For a symmetric
coin, the result is that the dispersion
in the number of heads appearing in sequences
of N tosses is

$$\sigma_n = \tfrac{1}{2} N^{1/2},$$

showing that the magnitude of the deviation
from an equal number of heads and
tails increases as the sequence gets
longer. But the deviations do not increase
as rapidly as the sequences
themselves. The relative deviation of n
decreases: if we express the result in
terms of the ratio r = n/N, the dispersion
in r is

$$\sigma_r^2 = \langle r^2 \rangle - \langle r \rangle^2 = \frac{1}{4N}.$$

Hence

$$\sigma_r = \frac{1}{2N^{1/2}},$$

showing that the relative distribution
gets narrower as N increases. In the
limit of an infinite number of trials,
the curve has no width at all, giving
the precise result r = 1/2. In the limit,
we return to the way we defined the
probability for each toss, by assuming
an infinite number of trials.
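
The dispersion quoted for the symmetric coin can be recovered by summing
the discrete series exactly. The sketch below is illustrative: the
function name `binomial_moments` and the sample values of N are
assumptions made for the example, and the dispersion is formed from
Eq. (3.12).

```python
from math import comb, sqrt

def binomial_moments(N, p=0.5):
    """Return (<n>, sigma_n) from the exact binomial probabilities."""
    q = 1 - p
    probs = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]
    mean = sum(n * P for n, P in enumerate(probs))
    mean_sq = sum(n * n * P for n, P in enumerate(probs))
    return mean, sqrt(mean_sq - mean**2)   # Eq. (3.12)

for N in (4, 16, 64, 256):
    mean, sigma_n = binomial_moments(N)
    # columns: N, <n>, sigma_n, and the relative width sigma_r = sigma_n / N
    print(N, mean, round(sigma_n, 4), round(sigma_n / N, 4))
```

Each fourfold increase in N doubles the absolute width but halves the
relative width, as the two formulas above require.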



Fig. 3.4 Poisson distribution for various
values of $\mu$. After G. P. Wadsworth and J. G.
Bryan (op. cit.).



3.2 OTHER DISTRIBUTIONS

3.2.1 The Skew Binomial Distribution:
A "Plugged Nickel"

If the probabilities of the two
possible results A and B of a trial
are different, the binomial distribution
is no longer symmetric. Suppose
that the probabilities are p for the A
result and q for the B, with p + q = 1.
By a simple extension of the development
of the symmetric binomial distribution,
we find that the probability
for n results of type A in a sequence
of N trials is

$$p(n) = \frac{N!}{n!\,(N-n)!}\, p^n q^{N-n}. \tag{3.14}$$

The distribution is skewed for all
cases in which p ≠ q, as seen in several
cases graphed in Fig. 3.3.



3.2.2 The Multinomial Distribution





  n      BINOMIAL     POISSON

  0      .3675        .3679
  1      .3682        .3679
  2      .1841        .1839
  3      .0612        .0613
  4      .0153        .0153
  5      .00303       .00307
  6      .00050       .00051
  7      .000071      .000073

Table 3.2 Comparison of binomial probabilities
for p = 1/500 and N = 500, with Poisson
probabilities for $\mu$ = Np = 1.
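
The entries of Table 3.2 can be recomputed from the two formulas. This
is a minimal sketch: the function names `binomial` and `poisson` are
illustrative, and they implement Eq. (3.14) and the Poisson law
$p(n) = \mu^n e^{-\mu}/n!$ of Eq. (3.16), respectively.

```python
from math import comb, exp, factorial

def binomial(n, N, p):
    """Probability of n successes in N trials, Eq. (3.14)."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def poisson(n, mu):
    """Poisson probability of n successes, Eq. (3.16)."""
    return mu**n * exp(-mu) / factorial(n)

N, p = 500, 1 / 500
for n in range(8):
    # columns: n, binomial probability, Poisson probability for mu = N p
    print(n, f"{binomial(n, N, p):.6f}", f"{poisson(n, N * p):.6f}")
```

For these values the two distributions differ by less than one part in
a thousand of the total probability at every n.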



If there are more than two mutually
exclusive results possible from a
single trial, the probabilities of combinations
are describable by a multinomial
distribution. Let the probability
for a type 1 result be $p_1$, for a
type 2 result be $p_2$, and so forth, the
total number of different results being
$\nu$. By an extension of the development
of the binomial distribution, the probability
for $n_1$ results of type 1, $n_2$ of
type 2, etc., to occur in a sequence of
N trials is

$$p(n_1, n_2, \ldots, n_\nu) = \frac{N!}{n_1!\, n_2! \cdots n_\nu!}\; p_1^{n_1} p_2^{n_2} \cdots p_\nu^{n_\nu}, \tag{3.15}$$

where

$$p_1 + p_2 + \cdots + p_\nu = 1.$$
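
A short sketch may make Eq. (3.15) concrete. The function name
`multinomial` and both examples (a fair die thrown six times, and a
two-outcome case) are hypothetical illustrations, not taken from the
text.

```python
from math import factorial, prod

def multinomial(counts, probs):
    """Eq. (3.15): N!/(n_1! ... n_v!) * p_1**n_1 ... p_v**n_v."""
    N = sum(counts)
    ways = factorial(N)
    for n in counts:
        ways //= factorial(n)   # each partial quotient is an exact integer
    return ways * prod(p**n for p, n in zip(probs, counts))

# Hypothetical example: a fair die thrown six times, each face appearing once.
print(multinomial([1] * 6, [1 / 6] * 6))

# With only two kinds of result the formula reduces to the binomial, Eq. (3.14).
print(multinomial([3, 7], [0.5, 0.5]))
```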



3.2.3 The Poisson Distribution 

This is a limiting form of the
binomial under certain conditions. The
conditions are that p → 0 and N → ∞ in
such a manner that the product Np tends
to a constant limit $\mu$. The Poisson distribution
for n successes is

$$p(n) = \frac{\mu^n e^{-\mu}}{n!}. \tag{3.16}$$

Several examples of Poisson distributions
are shown in Fig. 3.4; notice the
similarities between Fig. 3.4 and the
skewed binomial distributions in Fig.
3.3. The correspondence between the
Poisson and binomial distributions in
the region of small p and large N is
also evident in Table 3.2, which compares
the two for the case of p = 1/500
and N = 500.



3.2.4 The Gaussian Distribution

The Gaussian distribution is also a
limiting form of the binomial: It is
appropriate for the case of large N,
such that the values of results can be
treated as a continuous variable x. In
terms of the product $\mu = Np$ and the
dispersion $\sigma = \sqrt{Npq}$, the probability
that the number of results lies between
x and x + dx is

$$dp(x) = f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx. \tag{3.17}$$

Fig. 3.5 Gaussian distributions for $\mu = 0$
and various values of $\sigma$. After G. P. Wadsworth
and J. G. Bryan (op. cit.).

The Gaussian distribution is usually a
satisfactory approximation for the binomial
for Np > 5. It also approximates
the Poisson distribution when $\mu$ is sufficiently
large. The Gaussian distribution
is particularly useful because it
has convenient mathematical properties;
it is generally used in place of the
Poisson and binomial distributions
whenever it is reasonably accurate to
do so. In Fig. 3.5 several samples are
shown for fixed $\mu$ and various $\sigma$. The
dispersion controls the width, while $\mu$
controls the position of the curve:
Varying $\mu$ would cause the sample curves
to slide to higher or lower x, but with
no change in shape. The formula for the
dispersion in the Gaussian distribution
is the same as that used in the discussion
of the binomial distribution, that
is, the value of $\langle x^2 \rangle - \langle x \rangle^2$ for the
Gaussian distribution is $\sigma^2$.
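
The statements about the Gaussian density of Eq. (3.17) can be verified
by numerical integration. The sketch below is illustrative only: the
midpoint rule, the integration range of ten dispersions on either side
of the mean, and the sample values $\mu = 3$, $\sigma = 2$ are all
assumptions chosen for the example.

```python
from math import exp, pi, sqrt

def gaussian(x, mu, sigma):
    """Probability density f(x) of Eq. (3.17)."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def average(g, mu, sigma, steps=20000, width=10.0):
    """<g(x)> of Eq. (3.11), by the midpoint rule over mu +/- width*sigma."""
    a = mu - width * sigma
    dx = 2 * width * sigma / steps
    xs = (a + (k + 0.5) * dx for k in range(steps))
    return sum(g(x) * gaussian(x, mu, sigma) for x in xs) * dx

mu, sigma = 3.0, 2.0
norm = average(lambda x: 1.0, mu, sigma)       # normalization, Eq. (3.9)
mean = average(lambda x: x, mu, sigma)         # <x>
mean_sq = average(lambda x: x * x, mu, sigma)  # <x^2>
print(norm, mean, mean_sq - mean**2)
```

The three printed numbers should be close to 1, to $\mu$, and to
$\sigma^2$, in agreement with Eq. (3.12).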

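The probability density for any distribution is normalized, according to
Eq. (3.9). For the Gaussian distribution we therefore have

$$ \int_{-\infty}^{\infty} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx = \sigma\sqrt{2\pi}. \tag{3.18} $$

Averages of x and x² for the Gaussian distribution are given by the definite
integrals:

$$ \langle x \rangle = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx = \mu, $$

$$ \langle x^2 \rangle = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx = \sigma^2 + \mu^2. \tag{3.19} $$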
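The integrals in Eqs. (3.18) and (3.19) can be checked numerically. The sketch below is only an illustration: the function name, the grid parameters, and the sample values μ = 3, σ = 2 are assumptions chosen for the example, not part of the text.

```python
import math

def gaussian_moments(mu, sigma, n_steps=40001, span=12.0):
    """Riemann-sum estimates of the integrals in Eqs. (3.18) and (3.19).

    Returns (integral of the bare exponential, <x>, <x^2>).
    n_steps and span (half-width of the grid in units of sigma) are
    illustrative choices.
    """
    lo = mu - span * sigma
    h = 2.0 * span * sigma / (n_steps - 1)
    s0 = s1 = s2 = 0.0
    for i in range(n_steps):
        x = lo + i * h
        e = math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        s0 += e * h
        s1 += x * e * h
        s2 += x * x * e * h
    return s0, s1 / s0, s2 / s0

area, mean_x, mean_x2 = gaussian_moments(mu=3.0, sigma=2.0)
print(area, 2.0 * math.sqrt(2.0 * math.pi))  # both close to sigma*sqrt(2*pi)
print(mean_x, mean_x2)                       # close to mu and sigma^2 + mu^2
```

With these sample values the averages come out near 3 and 13, as Eq. (3.19) predicts.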



These averages will be useful in Chapter 4.


POSITIONS AND VELOCITIES OF MOLECULES OF A GAS



The statistics of pure chance can be translated into the statistics of
several physical systems. One of the simplest examples concerns the
distribution of molecules in a gas. If we begin with a question about the
distribution in space, the statistics of a symmetric coin can be applied
immediately to its solution. Increasingly sophisticated questions about the
same molecules can be answered by using distribution functions derived from
the binomial law. The exploration will lead us into the elementary kinetic
theory of gases.

4.1 THE DISTRIBUTION IN SPACE 

Suppose that we have a quantity of gas sealed in a container. How uniformly
do the atoms arrange themselves throughout the volume? To make the problem as
simple as possible at first, we will consider the container divided into two
equal volumes, A and B. We imagine the atoms to be in random motion
throughout the total volume, each particle moving back and forth in an
irregular path of straight lines by a succession of reflections from the
container walls. The roughness of the walls causes the path to vary endlessly
so that an atom threads its way through the container in every direction,
eventually coming arbitrarily close to any point, from all angles. No portion
of the volume is avoided or preferred (except perhaps for the regions close
to the walls), and therefore the chance of finding the atom within a
specified portion of the container is proportional to the fraction of the
total volume that the portion occupies, independent of the location or shape
of the portion. Therefore, if the volume of the container is divided into two
equal portions of any shape, the chance that a specified atom is in either
portion is 1/2. This makes the probabilities of the atomic locations in the
two regions equivalent to the results of the tosses of a symmetric coin. To
pursue the analogy, we must guarantee the independence of probabilities of
location of the individual atoms, corresponding to the independence of
results of individual tosses.

To effect this correspondence, we assume that the density of atoms is
sufficiently low, so that collisions between them are infrequent: Each atom
is rarely disturbed by an encounter with another. Although in this
development we imagine atoms as if they only interacted by bumping, real
atoms or molecules sometimes have long-range interactions. When two real
molecules are within their range of interaction, their positions and
velocities are not completely independent: They are found to be correlated.
In the present example we consider the average separation to be much greater
than the range of interaction, so that their positions and velocities are
uncorrelated. Therefore, if two atoms are in the whole container, the
probability that both are simultaneously in the A half is 1/4. The chance of
finding one (either one) in A and the other in B is 1/2. When N atoms are
enclosed, the chance that n are in A is evidently the same as the chance that
n out of N tosses of a symmetric coin will be heads, and it is given by the
binomial distribution.
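The two-atom probabilities quoted above follow directly from the binomial distribution with p = 1/2. A minimal sketch, with a hypothetical function name chosen for this illustration:

```python
from math import comb

def prob_n_in_half(n, N):
    """Probability that exactly n of N independent atoms are in half A (p = 1/2)."""
    return comb(N, n) / 2 ** N

print(prob_n_in_half(2, 2))  # both atoms in A: 0.25
print(prob_n_in_half(1, 2))  # one atom in A, the other in B: 0.5
print(sum(prob_n_in_half(n, 10) for n in range(11)))  # probabilities sum to 1.0
```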

In Chapter 3 some properties of the binomial distributions are analyzed, and
the results are useful here. They show that the average atomic arrangement
will be symmetric for any number of molecules: 1/2 in A and 1/2 in B. At any
moment, the arrangement may be different from the average, the chance of a
certain deviation being equal to that of the same numerical deviation in a
sequence of tosses. But the gas is not frozen in its distribution, since the
individual atoms continue to travel through the container, and the
distribution changes continually. The time for an appreciable change to occur
is difficult to calculate with any precision, but we can readily estimate its
order of magnitude.

If the container walls are microscopically rough, as almost all real walls
are, atomic reflections are diffuse, or random in direction. By such
reflections, the spatial distribution of all the atoms is made nearly
independent of the previous arrangement as soon as each atom has time to make
one collision with a wall. Therefore, the "lifetime" of a particular
arrangement in a volume is L/⟨v⟩, where L is a characteristic length of the
container, and ⟨v⟩ is the average speed of an atom. If instead of concerning
ourselves with the arrangement relative to the two halves of V, we ask about
the distribution in a smaller part of the volume, the lifetime will be
smaller, corresponding to the dimensions of that small part. The lifetimes
for the arrangements of common molecules at ordinary temperatures in
laboratory-sized volumes are very short. Borrowing a result from a later
section of the chapter, ⟨v⟩ for low-mass molecules is on the order of
1000 m/sec at room temperature. Therefore, the time for a rearrangement to
occur in a 1-m cube is about a millisecond. For this volume, the system
passes through a succession of many arrangements rapidly. However, the
corresponding lifetime of a spatial distribution in a galaxy 100,000
light-years in diameter is about 3 × 10^10 years. Hence, for small volumes, a
series of observations lasting a second may be sufficient for averaging the
system over many arrangements, but for most astronomical volumes we can only
examine a single arrangement even if we observe over the course of many
years.
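The two lifetime estimates are simple divisions, L/⟨v⟩, and can be reproduced as follows. The conversion constants and the function name are assumptions introduced for this sketch; the speed of 1000 m/sec is the value quoted in the text.

```python
LIGHT_YEAR_M = 9.4607e15      # metres in one light-year (assumed conversion)
SECONDS_PER_YEAR = 3.156e7    # approximate number of seconds in a year

def arrangement_lifetime(length_m, mean_speed_m_per_s=1000.0):
    """Order-of-magnitude lifetime L / <v> of a spatial arrangement, in seconds."""
    return length_m / mean_speed_m_per_s

cube = arrangement_lifetime(1.0)
galaxy_years = arrangement_lifetime(1.0e5 * LIGHT_YEAR_M) / SECONDS_PER_YEAR
print(cube)          # 0.001 s for a 1-m cube
print(galaxy_years)  # roughly 3e10 years for a 100,000 light-year galaxy
```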

If either the astronomical or the laboratory-sized samples of gas are in an
abnormal arrangement at a given moment, perhaps because of some external
influence which has been suddenly removed, they will each lose all memory of
that situation in a few characteristic rearrangement times, after which the
positions will be in statistical equilibrium, moving from one arrangement to
another according to their relative statistical weights. The time for this
system of independent particles to relax from an unusual situation caused by
a change in its external environment is therefore about the same as the time
for new arrangements to occur under steady conditions. The equilibrium time
is not always identical to the time for a major reorganization for all types
of systems; the relaxation time and the redistribution time are comparable in
this example because the walls are assumed to be perfectly rough, the
molecules rarely collide, and also because we are now considering only their
arrangements in space.

Although every imaginable arrangement is possible, some are much more likely
than others, having greater statistical weights. From the results of
Chapter 3, we see that the atoms distribute themselves more perfectly between
the two halves as their numbers increase. Specifically, the percentage mean
deviation from an equal division varies as 1/√N. For 1 cubic meter of gas at
the density of interstellar space, which is about 1 atom/cm³, the mean
deviation is about 0.1 percent.

But for 1 cubic meter of air at sea- 
level pressure, in which there are 
about v* 10 2 2 molecules, the mean de- 
viation -3 about 10- percent. 
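The 1/√N scaling is easy to evaluate. In the sketch below the interstellar figure is the one quoted in the text; the number density used for air, about 2.5 × 10^25 molecules per cubic meter, is an assumed round value for sea-level conditions and is not taken from the text.

```python
import math

def percent_deviation(n_atoms):
    """Relative fluctuation 1/sqrt(N), expressed in percent."""
    return 100.0 / math.sqrt(n_atoms)

# Interstellar density quoted in the text: 1 atom/cm^3 = 1e6 atoms per cubic metre.
print(percent_deviation(1.0e6))     # 0.1 percent

# Assumed value, not from the text: about 2.5e25 molecules per cubic metre
# for air near sea level.
print(percent_deviation(2.5e25))    # about 2e-11 percent
```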

We need not limit ourselves to the case of equal subdivisions of the
container. The skewed binomial distribution for an asymmetric coin is
appropriate to the arrangements of molecules between two unequal parts of the
volume. It is described in Chapter 3, and predicts that the average division
of atoms is proportional to the ratio of volumes, according to the a priori
probabilities for individual particles. As in the example of two equal
portions, the perfection of the ratio of the populations in two unequal
portions is also improved as the total number of particles increases. But if
one of the portions becomes so small that the average number of particles it
contains is much smaller than the total, then its population will fluctuate
by a larger percentage. If, for instance, the sample is so small that the
average number it contains is only one atom, it will be subject to
fluctuations comparable to the average itself. The larger volume, on the
other hand, will not fluctuate at all, since it contains essentially all of
the particles, a constant number. Under these conditions, the numbers of
particles appearing in the small volume will follow a Poisson distribution.
The mathematical form of the Poisson distribution is described in Chapter 3
for the example of a very asymmetric coin, in the limit of a very small
probability of heads and very long sequences. For our physical example, the
probability of an atom's appearance in the small sample volume is equivalent
to the small chance of heads, and the large total number of atoms is
equivalent to the large total number of tosses. In terms of the average
number ⟨n⟩ of particles per unit volume of the whole container, the
probability for finding exactly n particles per unit volume of the small
sample is, after Eq. (3.16),

$$ p(n) = \frac{\langle n \rangle^{n}\, e^{-\langle n \rangle}}{n!}. \tag{4.1} $$

Some examples of the Poisson distribution are graphed in Fig. 3.4. The first
example, with μ = 1, corresponds to the physical example ⟨n⟩ = 1 discussed
above. It shows that there is just as much chance for finding 0 particles as
for 1 (e⁻¹ for each), half as much for 2, etc. There is thus a very wide
scatter in the percentage deviation. For larger ⟨n⟩ or V the absolute width
of the distribution increases, but the width relative to the total decreases:
In the last example μ = 10, and here the mean deviation is reduced to about
25 percent of the average value. In this case, the distribution is nearly
symmetric, and begins to resemble a Gaussian distribution. The approximation
becomes quite accurate by the time μ has increased to 50; in Table 4.1 the
binomial, Poisson, and Gaussian distributions are compared for this
situation. Because its mathematical properties are more convenient for
calculations, we prefer to use the Gaussian form when it is appropriate. The
Gaussian distribution is a good approximation when n is large enough to be
treated as a continuous variable. Translated into terms convenient for the
physical case, Eq. (3.17) for the Gaussian distribution gives the probability
f(n)dn that the number of particles per unit volume lies between n and
n + dn,

$$ f(n)\,dn = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(n-\langle n \rangle)^2}{2\sigma^2}\right] dn. \tag{4.2} $$
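The comparison summarized in Table 4.1 can be recomputed from the three formulas. The following is a sketch under stated assumptions: the function names are hypothetical, the Gaussian uses the dispersion σ = √(Npq) defined in Section 3.2.4, and the logarithm of the gamma function is used only to avoid overflow in the factorials.

```python
import math

N, P = 2500, 0.02
MU = N * P                                # average number, 50
SIGMA = math.sqrt(N * P * (1.0 - P))      # dispersion sqrt(Npq), about 7

def binomial(n):
    """Exact binomial probability of n successes in N trials."""
    log_p = (math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
             + n * math.log(P) + (N - n) * math.log(1.0 - P))
    return math.exp(log_p)

def poisson(n, mu=MU):
    """Poisson probability mu^n e^(-mu) / n!."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def gaussian(n):
    """Gaussian density with mean MU and dispersion SIGMA, evaluated at n."""
    return math.exp(-((n - MU) ** 2) / (2.0 * SIGMA ** 2)) / (SIGMA * math.sqrt(2.0 * math.pi))

for n in range(25, 80, 5):
    print(f"{n:3d}  {binomial(n):.4f}  {poisson(n):.4f}  {gaussian(n):.4f}")

# For mu = 1 the chances of finding 0 and 1 particles are equal (e^-1 each),
# and the chance of finding 2 is half as large.
print(poisson(0, 1.0), poisson(1, 1.0), poisson(2, 1.0))
```

The Gaussian values are symmetric about n = 50, while the Poisson values are slightly larger on the high side than on the low side.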

We have been considering the distribution of molecules as a fluctuating
population in a small sample "cell" of the whole container, a distribution in
time. We can also consider the distribution from the standpoint of variations
in space. In the equilibrium state every small region of the container is as
probable a location for an atom as any other cell of the same size.
Therefore, there is nothing to distinguish the fluctuations in one cell from
those in another, and any cell displays arrangements similar to those
expected for any other cell, except that they will occur at different times.
Therefore, a description of the instantaneous atomic populations of many
cells is equivalent to the description of successive stages of a single cell.
Interpreted in this fashion, Eqs. (4.1) and (4.2) are descriptions of the
distribution in space.

  n    BINOMIAL   GAUSSIAN   POISSON
 25     .0000      .0001      .0000
 30     .0006      .0010      .0007
 35     .0052      .0057      .0054
 40     .0212      .0205      .0215
 45     .0460      .0442      .0458
 50     .0569      .0570      .0563
 55     .0424      .0442      .0422
 60     .0199      .0205      .0201
 65     .0061      .0057      .0063
 70     .0013      .0010      .0014
 75     .0002      .0001      .0002

Table 4.1 Comparison of Gaussian and Poisson approximations to the binomial
probability, when N = 2500 and p = 0.02.

Fig. 4.1 Paths in phase space of some dynamical systems: (a) a free particle;
(b) a falling object; (c) a one-dimensional harmonic oscillator.



4.2 MOLECULAR VELOCITIES 

By using a combination of physical arguments and the methods of statistical
analysis, one can obtain the form of the distribution law for atomic
velocities. To make the analysis as simple as possible, we assume that the
atoms have no properties other than their (identical) mass, so that a
complete description of the state of a single atom can be given by specifying
its 3 position coordinates, x, y, z in Cartesian space, and its 3 velocity or
momentum coordinates, v_x, v_y, v_z or p_x = mv_x, p_y = mv_y, p_z = mv_z.
Although it simplifies the analysis to consider the atoms as if they
possessed no internal variables, the resulting distribution of velocities
will not depend on this restriction, and it will be possible at a later stage
to expand the treatment to include other atomic properties in the statistical
description.

In addition to the assumption of point masses, we assume that the gas is
isotropic, so that the motions look the same when viewed from any direction.
The gas can be isotropic in only a statistical sense, of course, since any
particular trajectory has a definite direction in space. But a series of
observations of many trajectories is assumed to show no tendency to prefer or
avoid certain directions, or to have speeds which are in any way correlated
with direction. Since all directions are equivalent, we are able to analyze
the three-dimensional motion by studying at first only the component motions
parallel to a single direction and later to combine the independent
distributions along each direction to make a complete three-dimensional
description.



4.3 PHASE SPACE 

The dynamical behavior of a point-mass molecule can be described by giving
the history of its 3 position and 3 momentum coordinates. As the 6 variables
change with time, the particle can be imagined to move along a path in a
six-dimensional hyperspace, called phase space. Certain properties of the
paths in phase space, those which are important to the theory of their
statistical behavior, can be obtained from their geometry alone, ignoring
their behavior as a function of time. In order to become familiar with the
concept of phase space, we will consider some examples of motion in one
dimension, so that their paths can be drawn as two-dimensional graphs.

A particle moves along the x direction at constant speed. Therefore p_x is
constant; x varies. Its graph is shown in Fig. 4.1a. An object falls from
rest, with acceleration g. Its momentum is p_y = mgt, and its displacement is
y = ½gt². Therefore, its phase space trajectory is the parabola
y = p_y²/(2m²g), shown in Fig. 4.1b. A mass is attached to a spring, and
executes simple harmonic oscillations according to the equations

$$ x = x_0 \sin\omega t; \qquad p_x = m\omega x_0 \cos\omega t. $$

We can obtain the equation for the corresponding path in phase space by
squaring each equation above and using the trigonometric identity

$$ \sin^2\theta + \cos^2\theta = 1. $$

Carrying out these operations, we obtain

$$ (m\omega x)^2 + p_x^2 = (m\omega x_0)^2. $$

The graph of this equation is an ellipse, shown in Fig. 4.1c.
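The parabola and the ellipse can be verified by generating points along each motion and substituting them into the path equations. This is a small sketch; the numerical values of the mass, amplitude, and frequency are arbitrary illustrative choices.

```python
import math

M, G = 2.0, 9.8             # illustrative mass (kg) and acceleration (m/s^2)
X0, OMEGA = 0.5, 3.0        # illustrative amplitude (m) and angular frequency (rad/s)

def falling_point(t):
    """(y, p_y) for an object falling from rest."""
    return 0.5 * G * t * t, M * G * t

def oscillator_point(t):
    """(x, p_x) for the harmonic oscillator."""
    return X0 * math.sin(OMEGA * t), M * OMEGA * X0 * math.cos(OMEGA * t)

def parabola_residual(t):
    """y - p_y^2 / (2 m^2 g); vanishes on the phase path of the falling object."""
    y, p = falling_point(t)
    return y - p * p / (2.0 * M * M * G)

def ellipse_residual(t):
    """(m w x)^2 + p_x^2 - (m w x_0)^2; vanishes on the oscillator's phase path."""
    x, p = oscillator_point(t)
    return (M * OMEGA * x) ** 2 + p * p - (M * OMEGA * X0) ** 2

times = [0.1 * k for k in range(50)]
print(max(abs(parabola_residual(t)) for t in times))  # zero up to rounding
print(max(abs(ellipse_residual(t)) for t in times))   # zero up to rounding
```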

The importance of the concept of phase space arises from the very powerful
technique of analysis of dynamical systems by means of Hamilton's equations.
In the Hamiltonian method, coordinates and momenta are in a symmetrical
relation with respect to one another: x and p_x, y and p_y, etc. These pairs
of variables are said to be canonically conjugate to each other, and they
play symmetric roles in Hamilton's equations of motion.

Fig. 4.2 One-dimensional motion of a "free" particle confined in a box having
perfectly reflecting walls.

We are preparing to use the concept of phase space to analyze finite volumes
of gases. Therefore, we will modify Fig. 4.1a by confining the particle to a
one-dimensional box, so that it only travels between limits, being reflected
at the walls. Now, the path is seen in Fig. 4.2: It has become a pair of
horizontal lines, one at +p_x as it moves to the right, and one at −p_x as it
moves to the left at the same speed. The two horizontal lines are connected
by vertical lines at the walls, where the reflections are assumed to take
place at precise positions. For real particles and real walls which are
slightly compressible, a reflection takes place over a small but nonzero
distance, and therefore, the actual trajectory would have slightly rounded
corners. In any event, the path is closed.

In the one-dimensional example, the motion of a particle in a box simply
repeats in each transit the motion of the previous cycle, but in two- and
three-dimensional motion the path is not repeated each time. This is because
the total energy is shared among the three components in the different
directions, and the component magnitudes will generally vary from one transit
to the next. In three dimensions, the particle moving at constant speed has
the constant total energy

$$ \epsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right). \tag{4.3} $$

As the collisions with the rough walls cause the particle to be reflected in
various directions, the magnitudes of the component momenta p_x, p_y, p_z
vary. If all directions are equally probable, there are no restrictions on
the individual magnitudes as long as the condition of constant energy is
maintained. This corresponds to the range for each component,

$$ -\sqrt{2m\epsilon} < p_i < +\sqrt{2m\epsilon}; \qquad p_i = p_x,\ p_y,\ p_z. $$

Let us now imagine many identical 
particles moving in the volume. If the 
atoms rarely collide with each other, 
each one will move very nearly as the 
solitary atom, tracing out an irregu- 
lar path in the six-dimensional phase 
space. In the case of many atoms, how- 
ever, each particle has a very much 
larger range of momenta than if there 
were no collisions. In a typical colli- 
sion between two particles one will be 
caused to speed up and the other to 
slow down. The result of a particular 
collision depends on details of the 
actual cross sections and directions of 
approach, but we do not need to consider
these details in our statistical
analysis.

The momentum available to a single atom now extends up to such a value that
the total energy U of the gas is contained in a single particle. The limits
of momentum in each dimension are thus

$$ -\sqrt{2mU} \le p_x \le \sqrt{2mU}. $$

If there are N identical particles, each one is subject to the same
restriction, and each roams over the phase space of larger extent. But they
do not do so independently, since all of their energies must add up to U at
all times. Therefore, the particles can be represented as N points moving
through the six-dimensional space, but restricted by the conservation of the
total energy to a surface of five dimensions. Projected upon a plane surface
in p_x and x, their momentary positions in phase space might resemble the
graph in Fig. 4.3. If this drawing were an illustration of an arrangement in
coordinate space, it would indicate the variations in density of the
molecules, i.e., their distribution in space. But in Fig. 4.3, we see a
distribution in phase space, which contains aspects of a distribution in
coordinate space combined with a distribution in momentum space.

Fig. 4.3 Graph of the instantaneous positions and momenta of a weakly
interacting gas in statistical equilibrium.



4.4 DISTRIBUTION IN PHASE SPACE 

In order to analyze the distribution in phase space by statistical methods,
we will assume that the space is divided into many small cells of equal
"volume" Δp_x · Δx. We make the crucial assumption that each cell has an
equal probability of occupation; this assumed uniformity of phase space is
analogous to the uniformity of coordinate space in the previous section. The
condition of uniform probability in phase space, which is the fundamental
hypothesis of statistical mechanics, and which is applicable to all identical
systems in statistical equilibrium, is discussed in considerable detail in
advanced works on statistical mechanics. It rests upon a combination of
physical and mathematical arguments that are beyond the level of this
monograph. We shall therefore simply adopt it without any further
justification.

We also assume that the probabilities are unchanged by the numbers of atoms
already in the cell. This assumption that the probability is independent of
the cell population is not strictly correct; it is only an approximation
which works when the density in phase space is small (in comparison to the
reciprocal of Planck's constant h). The approximation is equivalent to the
assumption of classical statistical behavior, which is generally valid for
gases and other systems at not-too-low temperatures. Under these assumed
conditions, if the probability of occupation by a single molecule in any cell
is g, the probability of occupation by N molecules is g^N.

For convenience in describing the arrangements of molecules among the cells,
we designate the cells by numbering them in sequence, and list their
populations thus: n_1 particles in cell 1, n_2 in cell 2, ... n_ν in cell ν.
If the molecules are indistinguishable, the probability for a distinct
arrangement is given by a multinomial distribution, as in Chapter 3:

$$ p(n_1, n_2, \ldots, n_\nu) = w(n_1, n_2, \ldots, n_\nu)\, g^N, $$

where

$$ w(n_1, n_2, \ldots, n_\nu) = \frac{N!}{n_1!\, n_2! \cdots n_\nu!} \tag{4.4} $$

is the statistical weight of the arrangement. If N and ν are large numbers
the statistical weights of some arrangements are enormously greater than
most of the rest. The arrangements of great statistical weight will therefore
be those that the gas spends most of its time in. Therefore the gas can be
approximately described as being only in its most probable state, ignoring
the rest. Fluctuations of the distribution away from the most probable one
will occur, analogous to the fluctuations in the ordinary density treated
earlier, but we will not consider them here.
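How strongly the statistical weight of Eq. (4.4) favors some arrangements can be seen even in a very small system. The sketch below enumerates every arrangement of 12 particles among 3 cells; these sizes are illustrative choices, far smaller than any real gas, and no energy constraint is imposed.

```python
from math import factorial
from itertools import product

def weight(populations):
    """Statistical weight N! / (n_1! n_2! ... n_v!) of Eq. (4.4)."""
    w = factorial(sum(populations))
    for n in populations:
        w //= factorial(n)   # each partial quotient is an exact integer
    return w

N, CELLS = 12, 3
arrangements = [p for p in product(range(N + 1), repeat=CELLS) if sum(p) == N]
best = max(arrangements, key=weight)
total = sum(weight(p) for p in arrangements)

print(best, weight(best))    # (4, 4, 4) 34650
print(weight((12, 0, 0)))    # 1
print(total == CELLS ** N)   # True: the weights sum to 3**12
```

Even here the even division outweighs the fully lopsided one by a factor of 34650; the disparity grows enormously as N and the number of cells increase.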

Considering only the most probable distribution, we seek to maximize the
statistical weight, Eq. (4.4). The distribution is subject to two equations
of constraint, however. The first is that the total number of particles is a
constant:

$$ \sum_{i=1}^{\nu} n_i = N, \tag{4.5} $$

and the second is that the total energy is constant; if the energy
corresponding to the ith cell is ε_i,

$$ \sum_{i=1}^{\nu} \epsilon_i n_i = U, \quad \text{where } \epsilon_i = p_i^2/2m. \tag{4.6} $$

To maximize w, we first express it in more convenient form by taking its
logarithm,

$$ \ln w = \ln N! - \ln(n_1!\, n_2! \cdots n_\nu!) = \ln N! - \sum_{i=1}^{\nu} \ln n_i!. \tag{4.7} $$

We use Stirling's approximation for the factorial, Eq. (3.4) (we are assuming
that each cell in phase space is large enough so that it contains enough
molecules for Stirling's approximation to be accurate). Using the formula for
ln n!,

$$ \ln n! = (n + \tfrac{1}{2}) \ln n - n + \tfrac{1}{2} \ln 2\pi, \tag{4.8} $$

and substituting in Eq. (4.7) above,

$$ \ln w = (N + \tfrac{1}{2}) \ln N + \frac{1-\nu}{2} \ln 2\pi - \sum_{i=1}^{\nu} (n_i + \tfrac{1}{2}) \ln n_i = \text{a constant} - \sum_{i=1}^{\nu} (n_i + \tfrac{1}{2}) \ln n_i. \tag{4.9} $$

For the most probable arrangement, the variation of ln w with respect to
small variations of all the populations is zero:

$$ d(\ln w) = 0 = -\sum_{i=1}^{\nu} \left(\ln n_i + 1 + \frac{1}{2n_i}\right) dn_i \approx -\sum_{i=1}^{\nu} \ln n_i \, dn_i, \tag{4.10} $$

if we neglect terms that are small compared to ln n_i; ln n_i >> 1.
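The accuracy of Stirling's formula, Eq. (4.8), can be gauged by comparing it with the exact value of ln n!. In this sketch the exact value is taken from the log-gamma function, and the sample values of n are arbitrary.

```python
import math

def stirling_ln_factorial(n):
    """Right-hand side of Eq. (4.8)."""
    return (n + 0.5) * math.log(n) - n + 0.5 * math.log(2.0 * math.pi)

for n in (5, 20, 100):
    exact = math.lgamma(n + 1)          # ln n!
    approx = stirling_ln_factorial(n)
    print(n, exact, approx, exact - approx)   # the gap shrinks roughly as 1/(12 n)
```

The error is already below one percent of ln n! for n = 5, which supports the requirement that each cell hold a reasonable number of molecules.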

The small variations in the cell 
populations are constrained to be con- 
sistent with the constant N and U of 
the entire collection of particles, by 
the method of undetermined Lagrange
multipliers. This is a method which can 
be used when a function of several var- 
iables has either a maximum or a mini- 
mum, and where there are additional 
equations of condition linking the var- 
iables. Each independent equation among 
the variables reduces by 1 the number 
of independent degrees of "freedom" of 
the function. The method of Lagrange 
multipliers is a simple technique for 
bringing the equations of constraint 
into a single functional equation. Its 
application will be seen in our case: 

By differentiating Eq. (4.5) and Eq. (4.6) with respect to the populations in
each cell,

$$ \sum_{i=1}^{\nu} dn_i = 0, \quad \text{and} \quad \sum_{i=1}^{\nu} \epsilon_i \, dn_i = 0, $$

we obtain two equations linking the variations among the n_i. They can be
incorporated into Eq. (4.10) by multiplying each by an undetermined parameter
and adding all three together:

$$ \begin{aligned}
\alpha \sum_{i=1}^{\nu} dn_i &= 0 \\
+\ \beta \sum_{i=1}^{\nu} \epsilon_i \, dn_i &= 0 \\
+\ \sum_{i=1}^{\nu} \ln n_i \, dn_i &= 0; \\
\sum_{i=1}^{\nu} (\ln n_i + \alpha + \beta \epsilon_i)\, dn_i &= 0.
\end{aligned} $$

Since this equation does not depend on the choice of cell size or the numbers
of particles in each cell, the term in parentheses must vanish for each cell
independently:

$$ \ln n_i + \alpha + \beta \epsilon_i = 0. $$

Taking antilogs,

$$ n_i = a \exp(-\beta \epsilon_i), \tag{4.11} $$

where a = e^{−α}.
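That the populations of Eq. (4.11) maximize the statistical weight can be tested numerically: shifting particles among cells in a way that keeps both N and U fixed should lower ln w. The sketch below is an illustration under stated assumptions: the values of β and a, the equally spaced energies, and the particular shift are all arbitrary choices, and fractional populations are handled with the log-gamma function.

```python
import math

def ln_weight(populations):
    """ln w = ln N! - sum ln n_i!, using lgamma so that fractional n_i are allowed."""
    total = sum(populations)
    return math.lgamma(total + 1.0) - sum(math.lgamma(n + 1.0) for n in populations)

BETA, A = 0.5, 1000.0                   # illustrative values of beta and a
energies = [0.0, 1.0, 2.0, 3.0, 4.0]    # equally spaced cell energies (arbitrary units)
boltzmann = [A * math.exp(-BETA * e) for e in energies]

def shifted(populations, delta):
    """Move particles among cells 0, 1, 2 while keeping both N and U fixed."""
    out = list(populations)
    out[0] += delta           # energy 0
    out[1] -= 2.0 * delta     # energy 1
    out[2] += delta           # energy 2
    return out

base = ln_weight(boltzmann)
for delta in (-20.0, 20.0):
    trial = shifted(boltzmann, delta)
    print(delta, ln_weight(trial) - base)   # negative in both directions
```

The shift changes the populations by (+δ, −2δ, +δ), so the particle number changes by zero and the energy by −2δ + 2δ = 0, as the constraints require.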

Equation (4.11) is one of the most famous equations in physics. It was
obtained by Boltzmann (1844-1906), in 1868. The term exp(−βε_i) is known as
the Boltzmann factor, and it has applicability to a wide variety of systems
in statistical equilibrium, not only the point-mass atoms of a gas. For, if
we review the analysis leading up to this point, we find that it depends on
only the following conditions and assumptions:

(a) There are many identical particles in statistical equilibrium.

(b) All volumes of phase space are equally probable.

(c) The probability of occupation of a cell is independent of the number of
particles in the cell. (Note that the cell size is arbitrary in our
discussion.)

(d) The populations in each cell are large enough for Stirling's
approximation to be used.

These conditions are also satis- 
fied by many other systems, including 
the vibrations of atoms in a crystal, 
the states of atomic moments in a mag- 
netic solid, and the conduction elec- 
trons in a semiconductor. In the pres- 
ent monograph, however, we are re- 
stricting our attention to the case of 
a gas, which will now be examined in 
detail. 



4.5 THE MAXWELL-BOLTZMANN GAS 

Equation (4.11) seems a little artificial for the gas of atoms, since it
describes the distribution in terms of the populations of discrete cells in
phase space. These cells are only an artifact we introduced to make the
various regions of the total phase volume distinguishable by subscripts.
However, we now see that the subscripts are unnecessary, since the energy ε
is all that the populations depend on, through the Boltzmann factor itself.
Although the phase space coordinates of the cell are not important, the cell
volume is: By condition (b) above, the population of a cell is directly
proportional to its volume. Therefore, the number of atoms in a certain
region of phase space is proportional to the volume of the region and to its
Boltzmann factor. If we define the population of a cell of energy ε and
volume Δp_x Δp_y Δp_z Δx Δy Δz to be Δn, Eq. (4.11) is transformed to

$$ \Delta n(\epsilon, \Delta p_x \Delta p_y \ldots \Delta z) = a\, e^{-\beta\epsilon}\, \Delta p_x \Delta p_y \Delta p_z \Delta x \Delta y \Delta z. \tag{4.12} $$

For infinitesimally small volumes, we have the differential form

$$ dn(\epsilon, dp_x\, dp_y\, dp_z\, dV) = a\, e^{-\beta\epsilon}\, dp_x\, dp_y\, dp_z\, dV, \tag{4.13} $$

where dV = dx dy dz is an infinitesimal volume in coordinate space.

Now, we can apply Eq. (4.13) directly to the case of the point-mass gas.
Since the atoms are assumed to move in a field-free region, the energy is not
a function of the position of the particle. Therefore, the populations are
also independent of their locations in the container, and we can combine them
by integrating over the entire volume V:

$$ dn(\epsilon, dp_x\, dp_y\, dp_z) = a V e^{-\beta\epsilon}\, dp_x\, dp_y\, dp_z. \tag{4.14} $$

Since the particles are assumed to collide infrequently, each atom moves
essentially as an independent particle. Therefore, the energy is related to
the momentum by the free particle expression, Eq. (4.3). Substituting for ε,
Eq. (4.14) becomes

$$ \begin{aligned}
dn(\epsilon, dp_x\, dp_y\, dp_z) &= a V \exp\left[-\beta\left(\frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}\right)\right] dp_x\, dp_y\, dp_z \\
&= a V \left[\exp\left(\frac{-\beta p_x^2}{2m}\right) dp_x\right] \times \left[\exp\left(\frac{-\beta p_y^2}{2m}\right) dp_y\right] \times \left[\exp\left(\frac{-\beta p_z^2}{2m}\right) dp_z\right].
\end{aligned} \tag{4.15} $$

This result contains the familiar form of the Gaussian distribution: We see
that it is the product of three Gaussian distributions, one in each momentum
variable. By comparing the momentum distributions with the general
expression, Eq. (3.17), we can obtain the average momentum ⟨p_x⟩ and the
average squared momentum ⟨p_x²⟩ by inspection. Comparison of the exponents
shows that the term which would correspond to μ in Eq. (3.17) is absent in
each factor of the momentum distribution. Since μ is equal to the average of
the variable itself, this means that the average of each momentum component
vanishes:

$$ \langle p_x \rangle = \langle p_y \rangle = \langle p_z \rangle = 0. \tag{4.16} $$



It is a reassuring result, since the gas was originally presumed to be in
equilibrium with its stationary container. Comparison of the remaining terms
of the exponents yields

$$ \frac{\beta p_x^2}{2m} = \frac{p_x^2}{2\sigma_x^2}, \qquad \frac{\beta p_y^2}{2m} = \frac{p_y^2}{2\sigma_y^2}, \qquad \frac{\beta p_z^2}{2m} = \frac{p_z^2}{2\sigma_z^2}. $$

Therefore,

$$ \sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \sigma^2 = m/\beta. \tag{4.17} $$

Thus, the dispersion in the momentum is the same in each direction. Since the
average momentum in each direction is zero, we have a simple relation between
the dispersion and the average of the squared components. By Eq. (3.12) we
see that in this case

$$ \sigma^2 = \langle p_x^2 \rangle = \langle p_y^2 \rangle = \langle p_z^2 \rangle = m/\beta. \tag{4.18} $$

Therefore, the averages are simply related to the Lagrange multiplier β and
the atomic mass. The parameter β, which has been undetermined so far, can now
be related to physical properties by comparison with experiment. For we found
in Chapter 1 that Charles' law implies a simple proportionality between the
kinetic energy and the temperature T on the absolute scale. That connection
leads us to the discovery of β: by Eq. (1.12) and Eq. (4.18), we have

$$ \langle \epsilon \rangle = \frac{1}{2m}\left(\langle p_x^2 \rangle + \langle p_y^2 \rangle + \langle p_z^2 \rangle\right) = \frac{3}{2\beta} = \frac{3}{2} kT. $$

And therefore, we obtain the gloriously simple and profound relation,

$$ \beta = 1/kT. \tag{4.19} $$

Before turning to a discussion of the significance of the result, we can
effect a definition of the other Lagrange multiplier by a similar comparison
with the general form of the Gaussian distribution. By this comparison it is
found that

$$ a = \frac{N}{V (2\pi m k T)^{3/2}}. \tag{4.20} $$

Therefore, the basic properties of the statistical distribution of momentum
have been found, and we can write the distribution law in terms of identified
quantities:

$$ dn = \frac{N}{(2\pi m k T)^{3/2}} \exp\left[-\frac{p^2}{2mkT}\right] dp_x\, dp_y\, dp_z, \tag{4.21} $$

where we have written p for the total momentum √(p_x² + p_y² + p_z²).
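The averages implied by Eq. (4.21) can be checked by direct numerical integration of one Gaussian momentum factor. The sketch below assumes a particular gas and temperature (helium at 300 K) purely for illustration; the constants, names, and grid parameters are not from the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
M_HE = 6.6465e-27         # mass of a helium atom in kg (illustrative choice of gas)
T = 300.0                 # temperature in kelvin (illustrative)

def component_moments(m, temp, n_steps=20001, span=10.0):
    """Riemann-sum <p_x> and <p_x^2> for one Gaussian factor of Eq. (4.21)."""
    sigma = math.sqrt(m * K_B * temp)            # sigma^2 = m k T, from Eqs. (4.17), (4.19)
    h = 2.0 * span * sigma / (n_steps - 1)
    norm = mean = mean_sq = 0.0
    for i in range(n_steps):
        p = -span * sigma + i * h
        f = math.exp(-p * p / (2.0 * sigma * sigma))
        norm += f
        mean += p * f
        mean_sq += p * p * f
    return mean / norm, mean_sq / norm

mean_p, mean_p2 = component_moments(M_HE, T)
mean_energy = 3.0 * mean_p2 / (2.0 * M_HE)       # three equivalent components
v_rms = math.sqrt(3.0 * mean_p2) / M_HE          # root-mean-square speed

print(mean_energy / (K_B * T))   # close to 1.5
print(v_rms)                     # on the order of 1000 m/s
```

The root-mean-square speed found this way for a light gas at room temperature agrees in order of magnitude with the 1000 m/sec used earlier for the rearrangement time.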

By a combination of reference to 
experiments on gross volumes of gases, 
a belief that gases are composed of 
atoms, and a statistical analysis of 
their random motions, we have arrived 
at a formula for the velocities of heat 
motion of the atoms in a gas. 

Equation (4.21) was originally
derived in 1859 by Clerk Maxwell
(1831-1879), whose analysis followed
very different lines. Our derivation is
patterned after Boltzmann's. In either
the form above or in one of several
related forms, it is called the
Maxwell-Boltzmann distribution law. It
was confirmed experimentally by several
researchers in the twentieth century:
One of the first was Otto Stern, in
1920. Since then, the Maxwell-Boltzmann
distribution has been tested for other
"gases," including plasmas and neutrons
in a reactor. Some experimental
comparisons are shown in Fig. 4.4.

According to the Maxwell-Boltzmann
distribution law, the momentum distri- 
bution has a width that is controlled 
by the temperature. If more heat energy 
is given to the atoms, they will, by 
means of their collisions, share it 
among themselves according to statis- 
tical laws within a certain relaxation 
time. The added heat raises the average 
energy of all the atoms, and although 
the average momentum in each direction 
is zero, the average squared momentum 
increases. This increases the disper- 
sion of the distribution, so that the 
variations in the momenta of atoms 
picked at random are greater at higher 
temperatures. We can liken these 
changes with temperature to the varia- 
tions that would be seen in an error 
curve, if the measuring instrument had 
a variable sensitivity. The coarseness 
of the "measurement" is an effect of
the many collisions which make up the 
statistical equilibrium: As the average 
energy increases, so does the disper- 
sion. Cooling the gas improves the pre- 
cision of the "measurement," but it 
also reduces the quantity itself. As 
the temperature is reduced, the curve 
becomes sharper, and if the
Maxwell-Boltzmann distribution were to
continue to be obeyed at all
temperatures, it would imply that one
could measure precisely
$\langle \epsilon \rangle = 0$ at
$T = 0$. However, long before then,
real gases liquefy, and the assumptions
leading to the distribution law become
poor approximations of the real
situation. Other forms of heat motion
take place in liquids and solids, and
they may be treated by statistical
analyses based on the same
approximations we have used here. On
such a basis, the heat motions in
liquids and solids also vanish at
$T = 0$. But it was realized early in
the twentieth century that several of
the assumptions of the classical theory
are not strictly correct. In the form
in which they are given following Eq.
(4.11), they are reasonable
approximations for relatively high
temperatures and low densities, but for
dense and cold materials, the motion
must be
analyzed in terms of quantum mechanical
laws. One important effect of quantum
mechanical behavior can be stated in
closing: The motions at $T = 0$ are
small, but not zero. As absolute zero
is approached, the atoms slow down to
some minimal momenta and energies. When
all the heat energy is removed, a
minimum and unextractable amount
remains. This so-called zero-point
motion implies a residual dispersion in
the momentum, a direct consequence of
the laws of quantum mechanics.

Fig. 4.4 Measured distributions of
various properties of some systems of
weakly interacting particles, compared
with theoretical curves derived from
the Maxwell-Boltzmann law.

(a) Velocity spectra of protons (a.1)
and tritons (a.2) from the d-d reaction
in the Scylla experiment on fast
magnetic compression of a plasma,
obtained by D. E. Nagle, W. E. Quinn,
W. B. Riesenfeld, and W. Leland, Phys.
Rev. Letters 3, 318 (1959). The curves
are drawn according to theoretical
distributions corresponding to
temperatures $T = 11.6 \times 10^{6}$ °K
(1.0 keV) and $17.4 \times 10^{6}$ °K
(1.5 keV).

(b) Intensity distribution of neutrons
emitted in the thermal beam of a
reactor, as a function of neutron
wavelength $\lambda$ ($\lambda = h/mv$),
measured by J. G. Dash and H. S.
Sommers, Jr., Rev. Sci. Instr. 24, 91
(1953). The curve is calculated for a
beam effusing from a Maxwellian source
at 320 °K, modified by a spectrometer
transmission function.

(c) Measured transmission curve and
calculated Maxwell-Boltzmann
transmission curve for a beam of
potassium atoms effusing from an oven
at 157 °C. Abscissa is approximately
equal to transit time of atoms through
velocity selector. Measurements were
made by P. M. Marcus and J. B. McFee,
Recent Research in Molecular Beams, I.
Estermann (ed.), (Academic Press, New
York, 1959).
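The way the width of the classical distribution scales with temperature, and the correspondence between the temperatures and energies quoted in Fig. 4.4, can be made concrete with a few lines of arithmetic. This is only a sketch under stated assumptions: it uses the classical result $\sigma^2 = mkT$ throughout (so it says nothing about zero-point motion), takes helium-4 as an illustrative atom, and the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant, J/K
JOULES_PER_KEV = 1.602176634e-16
HELIUM_MASS = 6.6464731e-27    # kg, one helium-4 atom (illustrative choice)


def momentum_width(mass, temperature):
    """Classical width sigma = sqrt(m k T) of one momentum component."""
    return math.sqrt(mass * K_BOLTZMANN * temperature)


def rms_speed(mass, temperature):
    """Root-mean-square speed sqrt(<p^2>)/m = sqrt(3 k T / m)."""
    return math.sqrt(3.0 * K_BOLTZMANN * temperature / mass)


def kev_to_kelvin(energy_kev):
    """Temperature T at which kT equals the given energy in keV."""
    return energy_kev * JOULES_PER_KEV / K_BOLTZMANN


for temperature in (4.0, 77.0, 300.0, 1200.0):
    sigma = momentum_width(HELIUM_MASS, temperature)
    speed = rms_speed(HELIUM_MASS, temperature)
    print(f"T = {temperature:7.1f} K  sigma = {sigma:.3e} kg m/s  "
          f"v_rms = {speed:8.1f} m/s")

print(f"kT = 1.0 keV corresponds to T = {kev_to_kelvin(1.0):.3e} K")
```

Because the width goes as the square root of the temperature, quadrupling T only doubles the spread in momentum. In the classical formula the width shrinks to zero as T approaches zero, which is exactly the conclusion that the quantum mechanical zero-point motion contradicts.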
In this monograph we have attempted
to show the march of scientific
progress in the field of kinetic theory
of gases. Experiment confirmed theory
brilliantly: perhaps the student might
have expected it. Textbooks and
monographs tend to have this sort of
happy ending, a neatly packaged
rationalization or thesis. Economy of
the teacher's and student's time
usually demands that all of the
mistakes and fruitless paths in the
history of the science be ignored.
This lends an unreal and distorted
complexion to one's impressions of the
subject and to scientific achievement
in general. The development of our
ideas about heat motions of atoms is a
case in point. The path from the first
insights of the Greek atomists to the
experimental confirmation of the
detailed statistical theory took more
than 2000 years. During much of this
time men were attempting in one way or
another to rationalize the universe in
general and the nature of fire in
particular. Most of them were wrong, or
at least not part of the straight line
of progress sketched in our
description. And yet, the distance we
traveled in two millennia is not so
enormous that there were no glimmerings
of modern ideas at the very beginning.
Here is what Lucretius wrote in the
first century B.C.:

These atoms, which are separated 
from each other in the infinite void 
and distinguished from each other in 
shape, size, position and arrange- 
ment, move in the void, overtake 
each other and collide. 

And also 

Observe what happens when sunbeams 
are admitted into a building and 
shed light in shadowy places. You 
will see a multitude of tiny
particles mingling in a multitude of ways
in the empty space within the light 
of the beam, as though continuing in 
everlasting conflict, rushing into 
battle rank upon rank with never a 
moment's pause in a rapid sequence 
of unions and disunions. From this 
you may picture what it is for the
atoms to be perpetually tossed about 
in the void. 